ABSTRACT


Research  on  optical  data  processing  for  missile  guidance  and  robotics  is  described.  Our  major 
emphasis  is  pattern  recognition  using  feature  extraction  (Fourier  coefficients,  moments  and  chord 
features)  and  correlation  (using  distortion-invariant  synthetic  discriminant  function  matched  spatial 
filters).  All  of  our  research  in  pattern  recognition  concerns  multi-class  distortion-invariant  processors.  It 
includes  new  algorithms  to  extract  distortion  parameters  from  chord  features  and  a  hierarchical  moment 
feature  processor  for  distortion  parameter  estimation.  Extensive  database  tests  of  moments  and  synthetic 
discriminant functions have been performed. Component research has addressed acousto-optic (AO) cells, with
performance measures and detector effects described. Matrix-vector research includes error source
analysis, a new quadratic matrix algorithm, and initial laboratory system results, with attention to the
electronic support system and the laboratory system fabrication.


1.  INTRODUCTION 


During this first year (September 1984 - September 1985) of our new research contract in optical
data processing for missile guidance, we have addressed the key issues associated with this
technology. This research includes:

•  real-time  devices  and  components, 

•  new  system  architectures, 

•  new  algorithms, 

•  new  high-speed  general-purpose  optical  data  processing  techniques  and  systems, 

•  tests  on  new  and  extensive  image  databases, 

•  plus  new  pattern  recognition  techniques,  architectures,  algorithms  and  concepts. 

As  in  past  years,  we  have  been  quite  faithful  in  reporting  our  AFOSR  sponsored  research  in  various 
journals and conference publications. Twenty-four publications (an average of 2 per month) have resulted from this
AFOSR research (Chapter 18). Copies of the more relevant papers we have published over the past year
are  included  as  various  chapters  of  this  report.  These  are  included  to  provide  complete  documentation  of 
the  different  aspects  of  our  work. 

In Chapter 2, we provide a summary and overview of our research progress achieved during the past
year. This work addressed 6 vital areas of optical data processing research:

1. real-time spatial light modulators (Section 2.2 and Chapter 3),

2. optical pattern recognition (Section 2.3 and Chapter 4),

3. computer generated holograms (Section 2.4 and Chapter 5),

4. optical feature extraction (Section 2.5 and Chapters 6-10),

5. optical correlation (Section 2.6 and Chapters 11-14), and

6. optical linear algebra processors (Section 2.7 and Chapters 15-17).

Topic (1) concerns the vital issue of real-time spatial light modulators. Topics (2)-(5) address
pattern recognition for ATR using new optical pattern recognition (OPR) techniques. In this work, we
have been careful to address vital problems such as multi-class distortion-invariant pattern recognition of
military targets, the acquisition and use of a large database, and the effects of noise on
the algorithms used. Topic (6) concerns the most attractive item in optical processing at present and a
potentially quite general-purpose optical processor concept.


Details  of  the  more  salient  results  of  our  research  are  provided  in  Chapters  3-17.  In  Chapter  18,  we 
enumerate  our  AFOSR  sponsored  publications,  the  presentations  given  on  this  research  at  conferences  and 
seminars  during  the  past  year,  and  the  Master’s  and  PhD  students  that  this  grant  has  supported. 

Our level of AFOSR research support on this grant has not increased for several years, and our
separate AFOSR research proposal on optical artificial intelligence was not funded. This will significantly
impact our research program. Other funds are being sought to support the research we feel is
necessary. These factors, plus the unavailability of funding from Eglin AFB for our
Kalman filtering research, are expected to reduce the quantity of research we are able to
produce for AFOSR. We anticipate, however, that we will still remain considerably above the output level of other
researchers.

During  the  past  year,  the  principal  investigator  (PI)  presented  invited  talks  on  our  AFOSR 
sponsored  research  at  various  conferences  including  the  Critical  Review  of  Technology  SPIE  Conference 
on  Digital  Image  Processing  and  the  Critical  Review  of  Technology  SPIE  Conference  on  Computer 
Generated  Holograms  (SPIE,  Los  Angeles,  California,  January  1985)  and  the  DoD  conference  on  Parallel 
Algorithms  and  Architectures  for  ATR  (Leesburg,  Virginia,  conference  proceedings  published  February 
1985),  plus  other  OSA  and  SPIE  optical  computing  and  robotic  conferences  during  the  year.  The  PI  has 
chaired conference sessions and seminars and served on the organizing committees for the following
conferences and topics:

•  SPIE  (robotics), 

•  Optical  Society  of  America  (optical  computing), 

•  Optical  Society  of  America  (machine  vision), 

•  SPIE  (digital  image  processing), 

•  SPIE  (computer  generated  holograms). 

The  PI  was  also  guest  editor  of  a  special  issue  of  Optical  Engineering  on  robotics  and  computer  vision. 
He  was  invited  to  submit  papers  to  the  journal  Optical  Engineering  special  issues  on  pattern  recognition 
(November  1984),  optical  computing  (January  1985)  and  computer  generated  holograms  (October  1985). 

2. OVERVIEW AND SUMMARY

2.1  INTRODUCTION 


Our  six  major  research  areas  and  our  recent  progress  in  each  are  highlighted  in  Sections  2.2  -  2.7. 
Details of each aspect of our fifteen work topics follow in Chapters 3 - 17.


2.2 SPATIAL LIGHT MODULATORS (ACOUSTO-OPTIC CELLS, CHAPTER 3)

Recently, our spatial light modulator research has emphasized acousto-optic cells. In [1], we
considered  the  salient  acousto-optic  architectures  (spectrum  analyzers  and  correlators).  The  various 
acousto-optic  cell  and  acousto-optic  architecture  component  errors  have  been  enumerated,  grouped  into 
different  classes  and  combined  into  several  new  models.  New  performance  measures  for  acousto-optic 
correlators  and  spectrum  analyzers  were  defined  and  detailed  (spectrum  estimation,  delay  estimation,  and 
detection).  Each  is  an  appropriate  performance  measure  for  a  different  application.  General  error-free 
formulae  for  each  of  these  performance  measures  were  derived  and  the  performance  obtained  with  each 
was  described  and  quantified  as  a  function  of  the  various  system  parameters.  Our  new  work  [2]  in  this 
area  (Chapter  3)  addressed  component  error  source  effects  on  performance  (specifically  detector  effects). 
We  plan  to  apply  AO  processors  to  optical  image  processing  in  our  future  research. 

2.3  OPTICAL  PATTERN  RECOGNITION  REVIEWS  (CHAPTER  4) 

Our  AFOSR  optical  pattern  recognition  research  is  at  the  forefront.  Our  paper  [3]  in  Chapter  4  on 
coherent  optical  pattern  recognition  was  included  in  the  recent  Critical  Review  of  Technology  series  on 
Digital  Image  Processing.  A  more  recent  review  [4]  was  one  of  only  two  optical  pattern  recognition 
papers  at  a  recent  DoD  conference  on  parallel  architectures  and  algorithms  for  ATR.  A  journal  OPR 
paper was invited and published in the Optical Engineering issue on optical computing [5] in January
1985. Chapter 4 [3] is a complete review of optical techniques for feature extraction and correlation and
includes new algorithms, architectures and hybrid optical/digital processing concepts. Sections 2.4-2.6
and Chapters 5-14 detail specific aspects of our recent OPR research.

2.4 CGHs FOR OPR (CHAPTER 5)


Our  1984-1985  research  has  increased  the  use  of  computer  generated  holograms  (CGHs)  for  optical 
pattern  recognition  (OPR).  We  were  selected  to  present  a  review  of  this  area  [6]  in  a  recent  Critical 
Review  of  Technology  conference  on  CGHs.  A  detailed  revised  version  [7]  of  this  paper  was  invited  for 
submission to an upcoming journal special issue. This review will be included in our 1985-1986 annual
report. In Chapter 5, we include recent work on the use of a CGH as a wedge ring detector for
diffraction pattern sampling [8].


2.5  OPTICAL  PATTERN  RECOGNITION  FEATURE  EXTRACTION 
(CHAPTERS  6-10) 

Three  new  optical  feature  extraction  techniques  have  been  detailed  in  our  recent  research: 

1.  the  use  of  multiple  feature  extractors  and  dimensionality  reduction  techniques  (we  consider  the 
specific  case  of  a  wedge  ring  detector-sampled  optically  produced  Fourier  transform  feature 
space) (Chapter 6 and Ref. [9]);

2. a new method to measure distortions from a chord distribution feature space (Chapter 7 and
Ref. [10]); and

3.  a  hierarchical  two-level  hybrid  optical/digital  moment  feature  processor  (Chapter  8  and 
Refs. [11]  and  [12]). 

Our optical Fourier transform space and multiple feature space work (Chapter 6) includes four different
dimensionality reduction and feature extraction techniques. It also includes a new classifier concept, quantitative data on
the importance of amplitude versus phase Fourier coefficients (for pattern recognition, rather than image
reconstruction), and the performance of each in the presence of noise. These represent quite novel results
which have thus far not been published for any other feature extractor (optical or digital). Experimental
results  for  two  letters  and  two  vehicles  with  25  images  of  each  at  different  scale  and  in-plane  rotational 
differences  were  obtained.  In  Chapter  7,  new  techniques  to  obtain  distortion  parameters  from  chord 
features  are  detailed  [10]. 

In Chapter 8, our new hybrid optical/digital moment processor, our new hierarchical moment-based
class estimator technique, and a new two-level classifier using moments are detailed, and the results
obtained on a set of ship images are presented [11]. Robotic part data on the same system are contained
in Ref. [12]. The performance of this system on non-controlled imagery and a new segmentation
processing technique were recently published [13] and are included in Chapter 9 for completeness. The
accuracy  with  which  the  distortion  parameter  estimates  can  be  obtained  is  summarized  [14]  in  Chapter 


2.6 OPTICAL PATTERN RECOGNITION CORRELATORS (CHAPTERS 11-14)


Our distortion-invariant multi-class multi-object correlator research emphasizes synthetic
discriminant functions (SDFs). Our tests and algorithms for projection SDFs on ship images, with data on
noise performance and new guidelines for the suggestion of projection values, were included in a recent
journal special issue on pattern recognition [15] and are provided in Chapter 11. New related SDFs that
optimize various performance measures [16] are detailed in Chapter 12. New correlation SDFs have been
described and initial results with them have been obtained for a tank and APC image database [17].
These results are summarized in Chapter 13. We were directed to perform tests on aircraft images by
AFOSR. These results [18] are included in Chapter 14.


2.7 OPTICAL LINEAR ALGEBRA PROCESSORS (CHAPTERS 15 - 17)


This optical data processing application area has received considerable recent attention.

A first vital aspect of optical linear algebra research that we initiated was the error source modeling
and simulation of OLAP (optical linear algebra processor) architectures and algorithms [19]. Chapter 15
details this work. A second novel facet of our OLAP research has concerned specific applications. The
application chosen for major attention was Kalman filtering, and the specific application of it was missile
guidance, control and state estimation. Support for future research in this area is questionable at present.
A third facet of our research is new parallel algorithms. A new parallel algorithm for the solution of
quadratic nonlinear matrix equations using a finite number of steps has been devised [20] and is detailed
in Chapter 16.

The fourth and final aspect of our OLAP research has been attention to fabrication of an OLAP.
We recently [21] discussed our laboratory processor and its electronic support and initial results. This is
detailed in Chapter 17. A lengthy version of this work is in preparation for a journal special issue. This

Reprinted from Applied Optics, Vol. 24, page 1224, April 15, 1985
Copyright  ©  1985  by  the  Optical  Society  of  America  and  reprinted  by  permission  of  the  copyright  owner. 


Detector  effects  on  time-integrating  correlator 
performance 


Anastasios  Goutzoulis,  David  Casasent,  and  B.  V.  K.  Vijaya  Kumar 


Detector array effects are considered for a time-integrating acoustooptic correlator used for signal detection.
Effects such as detector area integration, detector element spatial response, and the location of the correlation
peak within a detector element are included. General SNR, P_D, and P_FA expressions are derived as a
function of various system and detector parameters. Quantitative data are provided for a Gaussian-Markov
signal, and initial experimental confirmation is included.


I.  Introduction 

Acoustooptic (AO) devices have been suggested for use in many new signal processing architectures and applications.1 This interest is motivated by the commercial availability, good reliability, and performance of new AO cells.1,2 One of the most attractive AO signal processors is the time-integrating (TI) correlator.3,4 This architecture is attractive because of the large processing gain it provides and the large signal deviations it can accommodate. However, only limited statistical analyses,5 error source consideration,6 and quantitative performance data have been published on this system. Published work has considered the effects of signal time bandwidth product (TBWP), input signal noise, detector noise,4 and finite detector area effects on time delay estimation applications.7

In this paper we consider detector effects in a signal detection application of a TI correlator. We consider correlators using AO cells operated in the linear intensity mode (since these architectures yield analytical results). Detector effects for AO cells operated in the amplitude mode can be analyzed following the procedures and models advanced herein. In Sec. II, we review the linear intensity TI correlator and derive an expression for its output including the finite area D of each detector element (and the associated spatial integration and sampling), the spatial weighting function

When  this  work  was  done  all  authors  were  with  Carnegie-Mellon 
University,  Department  of  Electrical  &  Computer  Engineering, 
Pittsburgh,  Pennsylvania  15213;  A.  Goutzoulis  is  now  with  West- 
inghouse  Research  &  Development  Laboratories,  1310  Beulah  Road, 
Pittsburgh,  Pennsylvania  15235. 

Received 11 October 1984.

0003-6935/85/081224-10$02.00/0.

©  1985  Optical  Society  of  America. 


w_n(τ) for an individual detector, and conventional signal and system parameters. The performance measures used are probability of detection P_D and probability of false alarm P_FA. Our prior statistical analysis6 related these to measurable correlator SNR values and showed that these factors completely characterize the system's performance. We do not consider detector noise and detector element cross talk, since earlier detector noise analyses4 can easily include such effects. The statistics (mean and variance) of the correlator's output are then evaluated in Sec. III for the case of Gaussian-distributed signal and noise. In Sec. IV, performance expressions are derived for the case of Gaussian-Markov signals. The effect of the finite detector area (Sec. V), the location of the correlation peak within a detector element (Sec. VI), and spatial weighting across each detector (Sec. VII) are then analyzed, and quantitative analytical results are provided. Brief experimental results are included (Sec. VIII), and then our summary and conclusions are advanced (Sec. IX) on the design of a TI correlator for detection applications. Signal, system, and output detector parameters are considered throughout. Emphasis is given to our general analyses, the quantitative effect of different parameters, and the analyses of various initial quantitative results.

II.  TI  AO  Correlator  for  Signal  Detection 

Fig. 1. Schematic of a time-integrating acoustooptic correlator.

A simplified schematic of a linear intensity TI AO correlator3 is shown in Fig. 1. We denote the reference signal by s(t) and the received signal by s(t - τ0) + n(t), where τ0 is the delay and n(t) is additive noise. For linear intensity modulation3 of the AO cells, the signals are added to two biases B1 and B2. The signal driving the point modulator at P1 is

$$ s_2(t) = B_2 + s(t - \tau_0) + n(t). \tag{1} $$

The light intensity leaving P1 is then proportional to s2(t). This light beam is expanded by lens L1 and uniformly illuminates the AO cell at P2. The signal s1(t) = B1 + s(t) modulates an rf carrier and drives the AO cell at P2. Lenses L2 and the spatial filter at P3 separate the diffracted and undiffracted orders, block the undiffracted order, and image the +1 diffracted order onto a linear detector array at plane P4. The detector array at P4 provides the time integration over T_I of the resulting light intensity s1(t)s2(t). Including the finite area D of the detector elements, we write the P4 output from the nth detector as

$$ I(n) = \frac{1}{T_I}\int_{-T_I/2}^{T_I/2}\int_{(n-1/2)D}^{(n+1/2)D} w_n(\tau)\,[B_1 + s(t-\tau)]\,[B_2 + s(t-\tau_0) + n(t)]\,d\tau\,dt, \tag{2} $$

where τ = x/v_s, x denotes the direction of the sound propagation, v_s is the speed of sound in the AO crystal, w_n(τ) is the spatial response weighting function for the nth detector element, and n = -N/2, ..., 0, ..., N/2 is the index for the N + 1 detectors. We note that D = D_s/v_s has units of time. (D_s is the detector area in distance units.) The normalization factor 1/T_I is included to simplify our results and does not affect the system's detection performance.
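As a rough numerical illustration of Eq. (2), the double integral can be approximated with a midpoint rule. The sketch below is not taken from the paper: the function name, the use of Python callables for s(t), n(t) and w_n(τ), and the grid sizes are assumptions made for illustration, and τ, τ0, D and T_I are all taken in the same time units.

```python
import math

def ti_correlator_output(s, noise, n, D, tau0, T_I, B1, B2,
                         w=lambda tau: 1.0, n_t=400, n_tau=40):
    """Midpoint-rule approximation of Eq. (2) for detector element n.

    s, noise and w are callables (hypothetical interface); D and tau0 are
    expressed in time units, as in the text (D = D_s / v_s).
    """
    dt = T_I / n_t          # time step over the integration window
    dtau = D / n_tau        # step across one detector element
    total = 0.0
    for i in range(n_t):
        t = -T_I / 2 + (i + 0.5) * dt
        # Point-modulator drive signal of Eq. (1)
        received = B2 + s(t - tau0) + noise(t)
        for j in range(n_tau):
            tau = (n - 0.5) * D + (j + 0.5) * dtau
            # AO-cell signal B1 + s(t - tau), weighted by the detector response
            total += w(tau) * (B1 + s(t - tau)) * received
    return total * dt * dtau / T_I
```

With s and noise set to zero the function returns the pure bias product B1·B2·D, which is the constant term that Sec. III assumes is subtracted from the output.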

Equation (2) contains all parameters necessary to study the effects of all detector parameters [i.e., D, w_n(τ) as well as the value of τ0 with respect to D] and various signal and system parameters (such as T_I, signal bandwidth, and TBWP) on the system's performance. Other AO system component errors can be treated individually in the input or frequency plane as shown earlier.6 Dead spaces between detector elements can be included by allowing w_n(τ) to become zero at the edges of each detector element. In our analyses, we assume 1:1 imaging from P2 to P4 in Fig. 1. Operation of the AO cell in the linear amplitude mode is also possible. In this case, the correlation output is present on a spatial carrier, and after postdetection processing the correlation obtained is still given by Eq. (2) with a different signal-to-bias ratio. Thus, our results can be extended to apply to both amplitude and intensity mode AO cell operation.

As performance measures, we use the parameters6 SNR1, SNR2, P_D, and P_FA. SNR1 is the typical SNR measure6 used in communications (the square of the ratio of the average correlation value at the peak to the standard deviation in the peak value). SNR2 is the same as SNR1 except the standard deviation is computed far from the peak. (It is thus similar to the peak-to-sidelobe ratio.9) The probability of detection P_D and probability of false alarm P_FA are related to these two SNR measures by6

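$$ P_D = \frac{1}{\sqrt{2\pi E^2[C(0)]/\mathrm{SNR}_1}} \int_{\theta}^{\infty} \exp\left(\frac{-\mathrm{SNR}_1\,|x - E[C(0)]|^2}{2E^2[C(0)]}\right) dx, \tag{3} $$

$$ P_{FA} = \frac{1}{\sqrt{2\pi E^2[C(0)]/\mathrm{SNR}_2}} \int_{\theta}^{\infty} \exp\left(\frac{-\mathrm{SNR}_2\,|x - E[C(\tau)]|^2}{2E^2[C(0)]}\right) dx, \tag{4} $$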


where E[C(0)] and E[C(τ)] are the means of the signal and noise, respectively, and θ is the detection threshold. By increasing θ, P_FA will be reduced, but P_D will also decrease. Note that E[C(τ)] and E[C(0)] can be estimated by evaluating the correlation C(τ) far from the peak τ ≫ 0 and at the peak τ = 0, respectively. We choose to express P_D and P_FA in terms of SNR1 and SNR2 because of the considerable ease with which these two SNR terms can be measured experimentally on an optical correlator. In our statistical analysis in Sec. III, we derive expressions for SNR1 and SNR2 and from these obtain P_D and P_FA expressions.
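Because Eqs. (3) and (4) are tails of Gaussian densities, they can be evaluated with the complementary error function. The following sketch assumes the reading of Eqs. (3) and (4) in which the standard deviation at the peak is E[C(0)]/sqrt(SNR1) and far from the peak is E[C(0)]/sqrt(SNR2); the function and argument names are illustrative only.

```python
import math

def gaussian_tail(x):
    """Q(x): probability that a zero-mean, unit-variance Gaussian exceeds x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def detection_probabilities(snr1, snr2, peak_mean, theta, sidelobe_mean=0.0):
    """Evaluate P_D (Eq. 3) and P_FA (Eq. 4) for a detection threshold theta.

    peak_mean plays the role of E[C(0)], sidelobe_mean that of E[C(tau)].
    """
    sigma_peak = peak_mean / math.sqrt(snr1)  # std. dev. at the peak
    sigma_far = peak_mean / math.sqrt(snr2)   # std. dev. far from the peak
    p_d = gaussian_tail((theta - peak_mean) / sigma_peak)
    p_fa = gaussian_tail((theta - sidelobe_mean) / sigma_far)
    return p_d, p_fa
```

Sweeping theta between sidelobe_mean and peak_mean shows the trade-off noted in the text: raising the threshold lowers P_FA but also lowers P_D.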


III.  Statistical  Analysis 

To simplify our statistical analysis, we assume uniform weighting across each detector, w_n(τ) = 1, and equal biases B1 = B2 = B and that constant bias terms are subtracted from the P4 output. Equation (2) for the nth detector output now contains the following five terms:

$$
\begin{aligned}
I(n) ={}& \frac{B}{T_I}\int_{-T_I/2}^{T_I/2}\int_{(n-1/2)D}^{(n+1/2)D} s(t-\tau)\,d\tau\,dt
+ \frac{BD}{T_I}\int_{-T_I/2}^{T_I/2} s(t-\tau_0)\,dt
+ \frac{BD}{T_I}\int_{-T_I/2}^{T_I/2} n(t)\,dt \\
&+ \frac{1}{T_I}\int_{-T_I/2}^{T_I/2}\int_{(n-1/2)D}^{(n+1/2)D} s(t-\tau)\,n(t)\,d\tau\,dt
+ \frac{1}{T_I}\int_{-T_I/2}^{T_I/2}\int_{(n-1/2)D}^{(n+1/2)D} s(t-\tau)\,s(t-\tau_0)\,d\tau\,dt.
\end{aligned}
\tag{5}
$$

For the case of zero-mean independent signal s(t) and noise n(t), the square of the expected value involves only the last term in Eq. (5), i.e.,


$$ E^2[I(n)] = \left[\int_{(n-1/2)D}^{(n+1/2)D} R_s(\tau - \tau_0)\,d\tau\right]^2, \tag{6} $$


where R_s is the signal autocorrelation function. The variance of I(n) is found from Eq. (5) to be

$$ \mathrm{var}[I(n)] = E[I^2(n)] - E^2[I(n)], \tag{7} $$

where Gaussian-distributed signals were assumed (third-order moments are zero, and the fourth moment theorem8 can be used) and where R_n(τ) is the noise autocorrelation function. Assuming that the signal and noise have similarly shaped autocorrelation functions, Eq. (7) simplifies to


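$$
\begin{aligned}
\mathrm{var}[I(n)] ={}& \frac{B^2}{T_I^2}\int_{-T_I}^{T_I}\iint_{(n-1/2)D}^{(n+1/2)D} (T_I - |z|)\,R_s(z - \tau + \tau')\,d\tau\,d\tau'\,dz \\
&+ \frac{B^2D^2}{T_I^2}\left(1 + \frac{1}{\mathrm{SNR}_I}\right)\int_{-T_I}^{T_I} (T_I - |z|)\,R_s(z)\,dz \\
&+ \frac{1}{T_I^2}\left(1 + \frac{1}{\mathrm{SNR}_I}\right)\int_{-T_I}^{T_I}\iint_{(n-1/2)D}^{(n+1/2)D} (T_I - |z|)\,R_s(z)\,R_s(z - \tau + \tau')\,d\tau\,d\tau'\,dz \\
&+ \frac{1}{T_I^2}\int_{-T_I}^{T_I}\iint_{(n-1/2)D}^{(n+1/2)D} (T_I - |z|)\,R_s(z - \tau + \tau_0)\,R_s(z + \tau' - \tau_0)\,d\tau\,d\tau'\,dz \\
&+ \frac{2B^2D}{T_I^2}\int_{-T_I}^{T_I}\int_{(n-1/2)D}^{(n+1/2)D} (T_I - |z|)\,R_s(z - \tau + \tau_0)\,d\tau\,dz,
\end{aligned}
\tag{8}
$$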


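where the input SNR_I is the ratio of the peak signal power to the peak noise power. If the assumption of similar correlation functions for the signal and noise is removed, the SNR_I expression can be appropriately modified.10 Assuming T_I ≫ 1/β, where β = BW_s is the signal bandwidth, we can omit the |z| factors in Eq. (8) and obtain

$$
\begin{aligned}
\mathrm{var}[I(n)] ={}& \frac{B^2}{T_I}\int_{-T_I}^{T_I}\iint_{(n-1/2)D}^{(n+1/2)D} R_s(z - \tau + \tau')\,d\tau\,d\tau'\,dz
+ \frac{B^2D^2}{T_I}\left(1 + \frac{1}{\mathrm{SNR}_I}\right)\int_{-T_I}^{T_I} R_s(z)\,dz \\
&+ \frac{1}{T_I}\left(1 + \frac{1}{\mathrm{SNR}_I}\right)\int_{-T_I}^{T_I}\iint_{(n-1/2)D}^{(n+1/2)D} R_s(z)\,R_s(z - \tau + \tau')\,d\tau\,d\tau'\,dz \\
&+ \frac{1}{T_I}\int_{-T_I}^{T_I}\iint_{(n-1/2)D}^{(n+1/2)D} R_s(z - \tau + \tau_0)\,R_s(z + \tau' - \tau_0)\,d\tau\,d\tau'\,dz \\
&+ \frac{2B^2D}{T_I}\int_{-T_I}^{T_I}\int_{(n-1/2)D}^{(n+1/2)D} R_s(z - \tau + \tau_0)\,d\tau\,dz.
\end{aligned}
\tag{9}
$$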

With no loss of generality, we assume that the correlation peak occurs at the n = 0 detector element. Then

$$ E^2[I(0)] = \left[\int_{-D/2}^{D/2} R_s(\tau - \tau_0)\,d\tau\right]^2, \tag{10} $$

where now -(D/2) < τ0 < (D/2). The variance at the peak var[I(0)] is thus given by Eq. (9) with n = 0 and the variance far from the peak var[I(n)] by neglecting the fourth term in Eq. (9). SNR1 and SNR2 can now be obtained from the ratio of Eq. (10) to var[I(0)] and Eq. (10) to Eq. (9), respectively. A numerical evaluation shows that the fourth term in Eq. (9) has a negligible 3% contribution to the total variance far from the peak. This is logical because R_s is sharply peaked and because the two factors in term four diverge as τ changes. For generality, we retain all terms in Eq. (9).

IV. Gaussian-Markov Case Study

We now use the results of our statistical SNR analysis in Sec. III to derive P_D and P_FA expressions. We consider the case of signals with a Gaussian-Markov autocorrelation function11:

$$ R_s(z) = R_0 \exp(-\beta |z|), \tag{11} $$

where β is the signal's 3-dB bandwidth and R0 is the signal power. This signal model was chosen because it allows an analytical evaluation of both SNR1 and SNR2 without the need for numerical evaluation. We have also numerically evaluated our results for a Gaussian-shaped autocorrelation signal model and obtained results similar to those obtained herein, where we include only the analytical results for the Gaussian-Markov model.

Using  the  model  in  Eq.  (11),  the  average  peak  power 
in  Eq.  (10)  can  be  shown  by  a  simple  but  tedious  analysis 
to  be 


1226  APPLIED  OPTICS  /  Vol.  24.  No.  8  /  15  April  1985 


$$ E^2[I(0)] = \frac{R_0^2}{\beta^2}\left\{2 - \exp[-\beta(D/2 + \tau_0)] - \exp[\beta(-D/2 + \tau_0)]\right\}^2. \tag{12} $$
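The closed form in Eq. (12) can be checked against a direct numerical integration of Eq. (10) with the autocorrelation model of Eq. (11). The sketch below is only a sanity check under the assumption |τ0| ≤ D/2; the function names and the step count are illustrative choices.

```python
import math

def peak_power_closed_form(R0, beta, D, tau0):
    """E^2[I(0)] from Eq. (12) for the Gaussian-Markov autocorrelation."""
    bracket = (2.0
               - math.exp(-beta * (D / 2 + tau0))
               - math.exp(beta * (-D / 2 + tau0)))
    return (R0 / beta * bracket) ** 2

def peak_power_numeric(R0, beta, D, tau0, steps=20000):
    """E^2[I(0)] from Eq. (10), integrating R_s(tau - tau0) by the midpoint rule."""
    h = D / steps
    integral = 0.0
    for k in range(steps):
        tau = -D / 2 + (k + 0.5) * h
        integral += R0 * math.exp(-beta * abs(tau - tau0))  # Eq. (11)
    return (integral * h) ** 2
```

Both functions agree closely for delays inside the detector element, and the closed form is symmetric in τ0, as expected from the even autocorrelation function.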

The var[I(0)] and var[I(n)] expressions now become


RqB2D2  I  2  \  4 R&l  1  1 

7’,/S  l  SNR,)  7',/S2  (  SNR,, 

+^Ai+^(1+iik)A2 

(13) 

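$$ \mathrm{var}[I(n)] = \frac{R_0 B^2 D^2}{T_I \beta}\left(8 + \frac{2}{\mathrm{SNR}_I}\right) + \frac{R_0^2}{T_I \beta^2}\left(1 + \frac{1}{\mathrm{SNR}_I}\right) A_3, \tag{14} $$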

where 
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P  P 

(15b) 

$$ A_3 = 4D - \frac{6}{\beta} + \frac{6}{\beta}\exp(-\beta D) + 2D\exp(-\beta D). \tag{15c} $$

Fig. 2. Effect of detector size D and integration time T_I on P_FA (for P_D = 0.999).

From Eqs. (12)-(15), we find

$$ \mathrm{SNR}_1 = \frac{E^2[I(0)]}{\mathrm{var}[I(0)]}, \tag{16} $$

$$ \mathrm{SNR}_2 = \frac{E^2[I(0)]}{\mathrm{var}[I(n)]}, \tag{17} $$

with E^2[I(0)], var[I(0)], and var[I(n)] given by Eqs. (12), (13), and (14), respectively. Here SBR = sqrt(R0)/B is the signal-to-bias ratio for the input data to the AO cell. The error-free SNR1 and SNR2 expressions are found (by applying l'Hopital's rule with D = 0 and τ0 = 0) to be


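$$ \mathrm{SNR}_1 = \frac{T_I \beta}{\left(2 + \dfrac{1}{\mathrm{SNR}_I}\right) + \left(8 + \dfrac{2}{\mathrm{SNR}_I}\right)\dfrac{1}{(\mathrm{SBR})^2}}, \tag{18} $$

$$ \mathrm{SNR}_2 = \frac{T_I \beta}{\left(1 + \dfrac{1}{\mathrm{SNR}_I}\right) + \left(8 + \dfrac{2}{\mathrm{SNR}_I}\right)\dfrac{1}{(\mathrm{SBR})^2}}. \tag{19} $$

These error-free expressions are useful for measuring the loss incurred when D ≠ 0 and τ0 ≠ 0.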
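To make the notion of loss relative to the error-free case concrete, the sketch below evaluates the error-free expressions as we read them from Eqs. (18) and (19) and compares SNR2 with a finite-detector value formed as the ratio of Eq. (12) to Eq. (14), using A3 from Eq. (15c). Several points are assumptions rather than statements from the paper: SBR is taken to be sqrt(R0)/B, the finite-D ratio has been rearranged by us into a form that depends only on SBR, and all function names are hypothetical.

```python
import math

def snr1_error_free(T_I, beta, snr_in, sbr):
    """Error-free SNR1 as read from Eq. (18)."""
    return T_I * beta / ((2 + 1 / snr_in) + (8 + 2 / snr_in) / sbr ** 2)

def snr2_error_free(T_I, beta, snr_in, sbr):
    """Error-free SNR2 as read from Eq. (19)."""
    return T_I * beta / ((1 + 1 / snr_in) + (8 + 2 / snr_in) / sbr ** 2)

def a3(beta, D):
    """A3 from Eq. (15c)."""
    return (4 * D - 6 / beta + (6 / beta) * math.exp(-beta * D)
            + 2 * D * math.exp(-beta * D))

def snr2_finite_detector(T_I, beta, snr_in, sbr, D, tau0=0.0):
    """Ratio of Eq. (12) to Eq. (14), rewritten with SBR = sqrt(R0)/B (assumed)."""
    bracket = (2 - math.exp(-beta * (D / 2 + tau0))
               - math.exp(beta * (-D / 2 + tau0)))
    denom = ((8 + 2 / snr_in) * beta * D ** 2 / sbr ** 2
             + (1 + 1 / snr_in) * a3(beta, D))
    return T_I * bracket ** 2 / denom

def detector_loss_db(T_I, beta, snr_in, sbr, D, tau0=0.0):
    """SNR2 loss (dB) of a finite detector relative to the error-free case."""
    ideal = snr2_error_free(T_I, beta, snr_in, sbr)
    actual = snr2_finite_detector(T_I, beta, snr_in, sbr, D, tau0)
    return 10 * math.log10(ideal / actual)
```

For a detector much narrower than 1/β the finite-detector value approaches the error-free one, and the loss grows as D increases, which matches the trend described in Sec. V.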

From Eqs. (12), (16), and (17), we can now quantify the P_D and P_FA performance to be expected as a function of the different signal and system parameters and the different detector effects. P_D is obtained by substituting Eqs. (12) and (16) into Eq. (3), and P_FA is found by substituting Eqs. (12) and (17) into Eq. (4). In calculating P_FA we assume E[C(τ)] = 0. This follows from our zero-mean signal and noise assumption and the fact that R_s(τ) will be sharply peaked. We also note that the P_FA we calculate corresponds to P_FA for one detector element. The total P_FA for the entire output (P_FAT) of N + 1 detectors can be obtained from our P_FA by


$$ P_{FAT} = 1 - (1 - P_{FA})^{N+1}. \tag{20} $$
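Equation (20) is straightforward to evaluate, but for the very small per-element P_FA values of interest a direct evaluation loses precision. The sketch below, with illustrative function names of our own, uses log1p and expm1 to avoid that, and also inverts Eq. (20) to find the per-element P_FA that meets a total false alarm budget.

```python
import math

def total_false_alarm(p_fa, n_detectors):
    """Eq. (20): false alarm probability over n_detectors = N + 1 elements."""
    if p_fa >= 1.0:
        return 1.0
    # 1 - (1 - p)^n, written to stay accurate when p is tiny
    return -math.expm1(n_detectors * math.log1p(-p_fa))

def per_element_false_alarm(p_fat, n_detectors):
    """Invert Eq. (20): per-element P_FA needed for a total P_FAT."""
    if p_fat >= 1.0:
        return 1.0
    return -math.expm1(math.log1p(-p_fat) / n_detectors)
```

For example, a per-element P_FA of 0.001 over 101 detectors gives a total false alarm probability of roughly 0.096, so the per-element requirement must be considerably tighter than the system-level one.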

The three detector effects we consider are the finite detector size D (Sec. V), the location of the correlation peak within a detector element (Sec. VI), and the spatial response across a detector element (Sec. VII). Each of these detector effects is treated separately, since for each case the integration time T_I, the input SNR_I, the signal bandwidth β = BW_s, the signal-to-bias ratio (SBR), and other such parameters affect the results. Our purpose in these next three sections is to quantify the effect of these various parameters on the detection performance (measured through P_D and P_FA) of a linear intensity TI AO correlator and to provide guidelines for TI AO correlator design.

V.  Area  Integration  Effects 

In this section, the effect of the finite detector element size is quantified. Graphic presentations are used to provide quantitative performance data. The trends observed are then noted and discussed. We include only P_FA data rather than P_D data to reduce the length of our text.

In Fig. 2 we show the variation of P_FA with D for different T_I values. Both P_FA and P_D improve as T_I increases (as expected since longer integration time reduces noise and enhances signal). P_FA and P_D also improve as D decreases. This is less immediately obvious but can be explained by realizing that increasing D increases the noise more than the signal (per detector element). This occurs since the noise is relatively uniform over the correlation plane, whereas the signal correlation is of narrow and finite width. Thus, for any D > 0 (not just for D greater than the width of the correlation peak), a larger D degrades P_FA and P_D performance. This effect is more pronounced when D is larger than the width of the correlation peak (123 μm for BW_s = 10 MHz and the shear TeO2 AO cell assumed). The data in Fig. 2 verify this and quantify this effect.

Fig. 3. Effects of input SNR_I (amplitude) and detector size D on P_FA (for P_D = 0.999).

As noted at the outset, many system and signal
parameters exist and affect performance. Next we
consider the effect of D and input SNR_I on P_FA. As
expected, we find (Fig. 3) that P_FA improves as SNR_I
increases (for a fixed D). For the case considered (T_I
= 50 μsec, BW_s = 10 MHz, or TBWP = 500), we find
that a smaller D is needed (and oversampling of the
correlation is required) when SNR_I is low (below 1.0 for
the case chosen). For example, if P_FA = 0.001 is desired
(with P_D = 0.999), the detector size must satisfy D < 18
μm if SNR_I = 0.1. (D = 18 μm is much less than the
123-μm width of the correlation peak.) We note (from
Figs. 2 and 3) that T_I and SNR_I have a much more
significant effect on P_FA than does D. For example, for
D = 70 μm, doubling T_I from 50 to 100 μsec (Fig. 2)
results in a quite significant P_FA improvement (from
10^-2 to 10^-5). Conversely, reducing D by a factor of 2
to 35 μm improves P_FA from 10^-2 to only 2 × 10^-3.
Thus, as a general system design guideline, if the desired
P_FA for a given SNR_I cannot be achieved with a
reasonable D, a slight increase in T_I can often overcome
finite detector element effects (assuming that the signal
duration is sufficient). For large SNR_I, the size D is of
concern. However, low SNR_I is the scenario of most
concern.


As our next signal parameter, we consider the effect
of D and the signal bandwidth BW_s on P_FA. Our
results are shown in Fig. 4. Recall that the width D_c of
the correlation peak decreases as BW_s increases,
specifically D_c = (2/BW_s)v_a. For BW_s = 1 MHz, D_c =
1230 μm, and all D values shown are much less than D_c,
and hence the variation of P_FA with D is negligible. As
BW_s increases, P_FA improves (due to the increased
TBWP). For BW_s > 10 MHz, achieving P_FA < 10^-3 is
easy for a wide range of detector sizes D. For BW_s =
10 MHz, D_c = 123 μm, and we see that any detector size
D < 100 μm (or D less than approximately D_c) yields
good P_FA < 10^-3 performance. However, as D is
increased further, the degradation in P_FA is more severe
for larger BW_s (since the width of the correlation peak
becomes increasingly less than the width D of a detector
and thus more correlation noise enters the detector).
For BW_s = 40 MHz, D_c = 31 μm, and for any D < 120
μm we find P_FA < 10^-3. Thus, as BW_s increases, the
maximum allowable D for a given P_FA increases. This
occurs because the improvement in P_FA (with increasing
BW_s) is larger than the degradation in P_FA (with
increasing D). For lower SNR_I cases, smaller D values
than those shown are expected to be required (as we
found in Fig. 3). For BW_s = 5 MHz, D_c = 246 μm, and
we find that D < 35 μm (one-seventh of the width of the
correlation peak) is required to obtain P_FA < 10^-3.
Thus, as BW_s decreases, we require finer sampling of
the correlation peak to maintain a given P_FA.
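
The peak widths quoted in this discussion can be checked with a few lines of Python. This is only an illustrative sketch: the acoustic velocity (about 617 m/sec, a typical slow-shear TeO2 value) is an assumption that is not stated in the text, chosen because it reproduces the quoted widths to within rounding, and the function names are hypothetical.

```python
# Sketch: correlation-peak width D_c = (2 / BW_s) * v_a and TBWP = T_I * BW_s.
# ACOUSTIC_VELOCITY is an assumed value (slow-shear TeO2); it is not given in
# the text but matches the quoted widths (1230, 246, 123, 31 um) to rounding.
ACOUSTIC_VELOCITY = 617.0  # m/sec, assumption


def correlation_peak_width_um(bw_hz, v_a=ACOUSTIC_VELOCITY):
    """Return the correlation-peak width D_c in micrometers."""
    return (2.0 / bw_hz) * v_a * 1e6


def time_bandwidth_product(t_i_sec, bw_hz):
    """Return the dimensionless time-bandwidth product T_I * BW_s."""
    return t_i_sec * bw_hz


for bw_mhz in (1, 5, 10, 40):
    d_c = correlation_peak_width_um(bw_mhz * 1e6)
    print(f"BW_s = {bw_mhz:>2} MHz -> D_c is about {d_c:.0f} um")

print("TBWP for T_I = 50 usec, BW_s = 10 MHz:",
      round(time_bandwidth_product(50e-6, 10e6)))
```

Because D_c scales as 1/BW_s, a detector of fixed size D samples the peak more coarsely as the bandwidth grows, which is the trend the curves of Fig. 4 quantify.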

The number of detector samples required within the
correlation peak and the P_FA obtainable thus interact
significantly as BW_s varies. The quantitative data in
Fig. 4 show this clearly (for the SNR_I and T_I values
selected). This vividly demonstrates the importance
of obtaining such plots for the parameters of the signal
of concern. Without this, the detector sampling
required for a given P_FA would be quite difficult to assess.
In general, for signals with large TBWP > 1000 and
moderately low SNR_I > 0.1, the detector size can be
chosen to be less than or equal to the width of the
correlation peak, and excellent P_FA < 10^-3 will result. For
signals with moderate TBWP = 500, increased
oversampling of the correlation plane is required.

Fig. 4. Effects of bandwidth BW_s and detector size D on P_FA (for
T_I = 100 μsec, SNR_I = 0.1, SBR = ∞).

Fig. 5. Effects of signal-to-bias ratio and detector size D on P_FA for
P_D = 0.999.

Last, we consider how D and the final and most
dominant system parameter (the SBR of the input data
to the AO cell) affect our P_FA performance measure.
Recall that SBR = ∞ for operation of the AO cell in the
amplitude modulation mode and that the best value for
the intensity modulation mode is SBR = 0.5. In Fig.
5, we show how P_FA varies with D and SBR. We
immediately note that the amplitude modulation mode
(SBR = ∞) yields much better performance for any D
value and allows much larger D values. This must be
qualified by noting that the output of a TI correlator
appears on a spatial carrier⁴ when the AO cells are
operated in the linear amplitude mode. Thus the detector
size in this case must be sufficiently small to detect the
spatial carrier. (This effect is not included in our
present data.) However, in a detection (compared to
a delay estimation) application of a correlator, we often
know where the correlation will occur (once we are in
synchronization), thus considerably reducing the
number of detectors required.

From Fig. 5 we see that P_FA degrades as D increases
(as explained before). The decrease in P_FA
performance as SBR decreases is due to the increase in the
signal-dependent noise present in the output of a TI
correlator. This bias cannot be simply subtracted from
the system's output. The slope of the SBR = 0.5 curve
in Fig. 5 is comparable with that of the high BW_s, SNR_I
and T_I curves in our prior (SBR = ∞) figures. However,
the associated D values are an order of magnitude
smaller. Thus quite small detectors and quite fine
correlation plane sampling are required for intensity
mode TI AO correlator operation. For example, for
the signal considered, the width of the correlation peak
is 31 μm, whereas the maximum detector size for P_FA
= 0.001 is 24 μm or approximately the width of the
correlation peak (for SBR = 0.5). A change in D by
only 5 μm to 19 μm (with SBR = 0.5) changes P_FA from 10^-3
to 10^-4. For the smallest realistic 10-μm detector size
shown, the P_FA values obtained are quite large (for SBR
< 0.4). Thus a finite detector size significantly affects
P_FA performance for intensity mode AO operation.
The low SNR_I, large T_I and large BW_s scenario
used in Fig. 5 is typical of most spread spectrum signal
cases.

VI.  Effects  of  Correlation  Peak  Location 

In this section, we consider our second detector effect
(the location τ_0 of the correlation peak within one
detector element of finite area D). We first consider P_FA
as a function of delay τ_0 (where -D/2 < τ_0 < D/2)
within one detector element for several signal
bandwidths BW_s and several detector sizes D. A delay τ_0
= 0.0 corresponds to a correlation peak located in the
center of a detector element, whereas a delay of ±0.5D
corresponds to a peak located at the edge of a detector
(between two detectors). In Fig. 6, we summarize our
quantitative P_FA performance as a function of D and
BW_s. Curves 1-7 correspond to systems with
increasing BW_s values and different D choices. For the
systems considered, the correlation width in seconds for
a signal of bandwidth BW_s is 2/BW_s. For a TeO2 cell
with 1:1 imaging, a 10-μm detector size corresponds to
a 16-nsec sampling time per detector. Specifically, a
BW_s = 10-MHz signal has a correlation width of 2/10
MHz = 200 nsec. Thus curve 3 corresponds to 200/32
≈ 6 detector samples within the correlation width and
curve 4 to ≈ 12 samples. Curves 1 and 2 have
considerably more samples. Curve 5 has 100/32 ≈ 3 samples.
Curve 6 has 1.5 samples, and curve 7 has 3 samples. As
BW_s increases (curves 1-7), P_FA improves due to the
larger signal TBWP. As the detector size becomes less
(curves 3 vs 4 and 6 vs 7), the correlation plane sampling
is better, and P_FA again improves. For larger BW_s, the
improvement due to smaller D values is larger since the
correlation peak is narrower (see Fig. 4). The variation
in P_FA from τ_0 = 0 (center of detector) to τ_0 = 0.5D (edge of
detector) also follows logically. As τ_0 increases for a
fixed D, the correlation peak power within D decreases
(assuming D is less than the width of the correlation
peak). As BW_s increases, the sensitivity to the location
of the peak within a detector element becomes more
important (since the width of the peak is less). The
variation in P_FA with τ_0 is most severe for signals with
a large BW_s (or correspondingly large TBWP).
Although P_FA is much better for large BW_s, this τ_0 effect
is still quite significant. For D = 20 μm and BW_s = 40
MHz (curve 6), P_FA varies from 0.0002 (when τ_0 = 0) to
0.02 (when τ_0 = D/2). This is a non-negligible factor
of 100 loss in performance. Thus, for higher BW_s
signals, increased correlation plane sampling is
recommended (e.g., curve 6 vs curve 7).

Fig. 6. Effects of the BW_s D product and the location (delay) of the
correlation peak within one detector (as a percent of D) on P_FA (for
P_D = 0.999).
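
The sample counts quoted for the curves of Fig. 6 follow from two numbers given in the text: the correlation width 2/BW_s and the 16-nsec sampling time of a 10-μm detector. The Python sketch below redoes that arithmetic. The function name is hypothetical, and the (BW_s, D) pairs other than D = 20 μm, BW_s = 40 MHz (curve 6) are inferred from the quoted ratios rather than stated in the text.

```python
# Sketch: number of detector samples across the correlation width 2/BW_s.
# NS_PER_UM follows from "a 10-um detector corresponds to 16 nsec" (1:1 imaging).
NS_PER_UM = 1.6


def samples_per_peak(bw_mhz, d_um, ns_per_um=NS_PER_UM):
    """Correlation width (nsec) divided by the time span of one detector (nsec)."""
    width_ns = 2.0 / bw_mhz * 1e3
    return width_ns / (d_um * ns_per_um)


# (BW_s in MHz, D in um); all pairs except (40, 20) are inferred, not quoted.
for bw_mhz, d_um in [(10, 20), (10, 10), (20, 20), (40, 20), (40, 10)]:
    n = samples_per_peak(bw_mhz, d_um)
    print(f"BW_s = {bw_mhz} MHz, D = {d_um} um: {n:.2f} samples per peak")
```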

The data in Fig. 6 were obtained for intensity
modulation (SBR = 0.5). Similar trends are expected for
linear amplitude modulation (i.e., SBR = ∞), but the
actual P_FA values will be better (since the effects of
finite SBR are absent).

VII.  Effects  of  Detector  Spatial  Weighting  Function 

Thus far, the spatial weighting function for each
detector element has been assumed to be a rect function. However,
this is not necessarily the case, especially for CCD
arrays.¹,² In many cases, the response profile for a
detector element can be modeled as a trapezoid whose
upper-to-lower base ratio d/D depends on the actual
array. To consider the effects of such a profile, we
assume a detector weighting function:


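$$
w_n(\tau) =
\begin{cases}
\dfrac{2\tau}{D-d} - \dfrac{2(n-\tfrac{1}{2})D}{D-d}, & nD - D/2 \le \tau \le nD - d/2, \\[4pt]
1, & nD - d/2 \le \tau \le nD + d/2, \\[4pt]
-\dfrac{2\tau}{D-d} + \dfrac{2(n+\tfrac{1}{2})D}{D-d}, & nD + d/2 \le \tau \le nD + D/2,
\end{cases}
\tag{21}
$$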


where n is the detector element number, D is the lower
base of the trapezoid, and d is the upper base. We have
chosen to use this weighting function (and the variable
delay τ_0) for its versatility in studying the effects of the
detector element's profile on the system's performance.
For example, for d = D, w_n(τ) describes a rectangular
profile, whereas d = 0 describes a triangular profile.
For any other d and D relationship, w_n(τ) is a
trapezoid.
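
The trapezoidal weighting can be written compactly in terms of the distance from the element center. The following Python sketch is one such reading of Eq. (21), not the authors' code: the function name is hypothetical, and returning zero outside element n is an assumption (Eq. (21) only defines the profile inside the element).

```python
# Sketch of the trapezoidal detector weighting w_n(tau) of Eq. (21).
# D is the lower base, d the upper base, n the detector element number.
def detector_weight(tau, n, D, d):
    """Return w_n(tau); zero outside element n (an assumption of this sketch)."""
    if not 0.0 <= d <= D:
        raise ValueError("need 0 <= d <= D")
    offset = abs(tau - n * D)          # distance from the element center nD
    if offset > D / 2:
        return 0.0                     # outside the element
    if offset <= d / 2:
        return 1.0                     # flat top of the trapezoid
    # Linear ramp; equals both sloped branches of Eq. (21) by symmetry.
    # d < D is guaranteed here, so the division is safe.
    return (D - 2.0 * offset) / (D - d)


# d = D gives a rectangle, d = 0 a triangle, anything between a trapezoid.
for d in (20.0, 12.0, 0.0):
    profile = [round(detector_weight(t, 0, 20.0, d), 2) for t in (-10, -8, -5, 0, 5, 8, 10)]
    print(f"d = {d}: {profile}")
```

Writing the two ramps as a single function of |τ - nD| avoids a separate branch for each side and handles the rectangular limit d = D without dividing by zero.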

Let us assume that the correlation peak lies within
the n = 0 detector element, i.e., -D/2 < τ_0 < D/2; then
w_0(τ) defines the profile of the detector element in
which the correlation peak occurs. The average peak
power in Eq. (6) then becomes

$$
I_{pp} = \left[ \int_{-D/2}^{D/2} w_0(\tau)\, R_s(\tau - \tau_0)\, d\tau \right]^2 . \tag{22}
$$

Substituting Eq. (21) into Eq. (22) yields the expression
for the average correlation peak power I_pp. We
evaluated this for different d/D ratios and found that a
rectangular detector gave the best I_pp value. The loss in
I_pp was 25% when τ_0 = D/2 rather than τ_0 = 0. A
trapezoidal detector response profile with d = 0.6D gave
30% less I_pp, and a triangular detector response profile
gave 70% less I_pp. To analyze w_n(τ) effects on SNR, we
assumed τ_0 = 0 (to simplify the analysis), since then the
variance of SNR_1 is independent of τ_0 (with w_n fixed),
and SNR_1 and SNR_2 vary with τ_0 in the same way that
I_pp does (for low SNR_I). For SBR = ∞, we evaluated
SNR_1 and SNR_2 as a function of D and BW_s for τ_0 = 0
for different w_n(τ) profiles. We found that both
triangular and trapezoidal w_n(τ) profiles gave better
output SNR values than did a rectangular w_n(τ). This
is expected for τ_0 = 0 since these w_n reduce the noise
more than the peak value. The SNR improvements
increased as BW_s or D increased (since the peak
narrows, and the weighting reduces the noise more than the
peak value). This is less true if SNR_I is larger.

Fig. 7. Effect of detector weighting profiles on P_FA (for P_D = 0.999)
as a function of the detector element size D (axis: 10 to 1000 μm).
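
The peak-power comparison between detector profiles can be explored numerically. The sketch below evaluates the peak-power integral of Eq. (22), read here as the square of the integral of w_0(τ)R_s(τ - τ_0) over one element, by the midpoint rule. The Gaussian form of R_s, its width parameter, and all function names are illustrative assumptions, so the percentages it produces are not expected to match the 25%, 30% and 70% figures quoted above.

```python
import math


def weight0(tau, D, d):
    """Trapezoidal weighting of the n = 0 element (zero outside it)."""
    offset = abs(tau)
    if offset > D / 2:
        return 0.0
    if offset <= d / 2:
        return 1.0
    return (D - 2.0 * offset) / (D - d)


def gaussian_rs(tau, width):
    """Assumed Gaussian-shaped correlation peak; `width` is hypothetical."""
    return math.exp(-(tau / width) ** 2)


def peak_power(D, d, tau0, width, steps=2000):
    """Midpoint-rule estimate of [integral of w_0(tau) R_s(tau - tau0)]^2."""
    h = D / steps
    total = 0.0
    for k in range(steps):
        tau = -D / 2 + (k + 0.5) * h
        total += weight0(tau, D, d) * gaussian_rs(tau - tau0, width)
    return (total * h) ** 2


D, width = 20.0, 60.0                    # um; illustrative values only
rect = peak_power(D, D, 0.0, width)
for label, d in (("rectangular", D), ("trapezoidal", 0.6 * D), ("triangular", 0.0)):
    ratio = peak_power(D, d, 0.0, width) / rect
    print(f"{label}: I_pp relative to rectangular = {ratio:.2f}")
```

With these assumed values the ordering (rectangular, then trapezoidal, then triangular) agrees with the trend reported in the text; the absolute losses depend on the shape and width of R_s.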

In Fig. 7 we show P_FA vs D for these three detector
element profiles for the case τ_0 = 0. We find that the
nonrectangular profiles perform best, with only a small
improvement in output SNR (0.5 dB) and in P_FA [P_FA = 0.00037
(triangular), P_FA = 0.00044 (trapezoidal) and P_FA =
0.00125 (rectangular)].

To compare output SNR values for τ_0 ≠ 0 for
different w_n(τ), we consider the worst-case τ_0 = D/2. We
compute the I_pp loss (due to τ_0 = D/2 compared with τ_0
= 0) and from this subtract the output SNR
improvement for τ_0 = 0 (due to the use of a nonrectangular vs
a rectangular detector element spatial response profile).
From this analysis, we find that trapezoidal and
triangular profiles give ~0.5 dB better output SNR than a
rectangular profile.

Thus it appears that the simpler rectangular detector
response profile model can be used (thus greatly
simplifying the analysis) with only small effects on the SNR
or P_FA to be expected. The P_FA results actually
obtained are expected to be slightly better than those
predicted by the simplified w_n(τ) theory.

VIII.  Experimental  Verification 

Initial experimental results obtained on a laboratory
TI AO correlator for signal detection are now reported.
A matching pair of TeO2 cells was used for the point
modulator and delay line. The center frequency of both
cells was 35 MHz. Each cell was operated in the linear
intensity mode. The cells were biased at 12 V and
operated with a signal level of 6 V (i.e., SBR = 0.5). No
additive noise was introduced, and thus SNR_I = ∞.
Although the cell bandwidth could accommodate
20-MHz data, we could not produce a signal of such
bandwidth because of equipment limitations. Thus we
used a Gaussian-distributed signal (from a noise
generator) with a Gaussian autocorrelation function and
a BW_s = 0.5 MHz. The width of the correlation peak
for this signal is 4 μsec with the 1:4 imaging system used.
To measure SNR_1 and SNR_2, we used a single detector
element with D = 200 μm and an integration time T_I =
5 msec. D = 200 μm corresponds to 1.28 μsec or about
one-third of the width of the correlation peak. No
additional system errors were introduced by the
bandwidth of the AO cells and the phase response of the
transducers over this small BW_s. This experimental
setup thus allowed detector size effects alone to be
studied (with all other error sources reduced to
negligible levels). To study the effect of D on system
performance, we inserted a variable detector aperture of
size D in front of the detector element and varied the
aperture (and hence D) in one dimension from 50 to 200
μm. The height of the slit was kept constant (at 100
μm) as its width was varied.


To measure SNR_1, we fixed D and centered the
correlation peak at the center of the detector element (τ_0
= 0). Since D = 50-200 μm corresponds to 0.32-1.28
μsec, which is less than the 4-μsec width of the
correlation peak, negligible errors are introduced by slight
mispositioning of the detector. Two hundred separate
measurements of the detector's output were taken.
(The noise or statistical fluctuations were different in
each measurement, and thus these data constituted a
different sample realization of the random correlation
process.) For each choice of D, the mean and variance
of these 200 samples were calculated and their ratio
calculated to provide our desired SNR_1.

To obtain SNR_2, we measured the value of the
correlation at the peak and far from the peak. To achieve
this, we moved the detector element far (~15 μsec) from
the peak location. Image plane detector difference
errors were negligible, since the same detector was used
for measurements both at the peak and far from the
peak. To reduce the effects of input light nonuniformity,
AO cell attenuation, and AO cell spatial response
variations, we measured the correlator's output with no
signal present (i.e., with only the carrier present) and
selected two output locations for our SNR_2
measurements where the light level was equal within 5%. To
facilitate a uniform output (i.e., negligible spatial
weighting due to the cell's acoustic attenuation), we used
a spatial filter in the frequency plane that reduced AO
cell nonuniform response variations. From our 200
measurements of the correlation output far from the
peak, we obtained estimates of the correlation noise
level and hence SNR_2 experimental data. We repeated
this procedure for different D values of 50, 100, 150, and
200 μm, corresponding to samplings of 50 to 12
samples/correlation peak width. For each case we obtained
200 measurement samples. To ensure that the slit was
centered in the middle of the detector element, we used
a scanning microscope. This also assured us of the
exact slit or detector width D used.

Our experimental data and the theoretical results
obtained from our theory for Gaussian autocorrelation
function signals are shown in Fig. 8. The theoretical
SNR_1 and SNR_2 values were obtained from Eqs. (9) and
(10) using Gaussian R_s(τ) and R_n(τ) functions.
Theoretically, we expect a SNR_1 of 24.5 dB (for D = 10 μm)
and a SNR_2 of 22 dB (for D = 10 μm). We expect a
constant 2.5-dB difference in these SNR measures with the
slight decrease shown for SNR as D is increased. Our
experimental data (Fig. 8) are in rather good agreement
with theory. Both SNR values are within 2 dB of the
theoretical values. SNR_2 is larger than SNR_1 as
predicted by theory, with the difference (2.2 dB) being very
close to theory (2.5 dB). Our experimental results show
that both SNR data remain approximately constant for
D values between 50 and 200 μm (as predicted by our
theoretical analysis). Since both of our experimental
SNR values are less than the theoretical ones by 2 dB,
further credence is given to our data. Possible reasons
for the 2-dB SNR difference (loss) are detector noise
and background optical noise.

Fig. 8. Theoretical and experimental data on the effect of the
detector element size D on SNR_1 and SNR_2.


Valid P_D and P_FA measurements would require more
than 1000 samples (to observe P_FA = 0.001 or 1 peak
that exceeds the threshold in 1000 measurements).
Since our results verified the validity of our analysis of
the effect of D on SNR_1 and SNR_2, the conventional
relationships between P_D and P_FA and SNR should be
valid. An advanced experimental verification would
require use of a higher BW_s signal, a larger number of
samples, more control over the D setting (i.e., smaller
and more accurate slit widths), a more sensitive
detector, etc. We note that under such conditions one could
verify the effects of correlation location and spatial
weighting function. In our experiments, verification
of the correlation peak location effects was not possible
because of the broad correlation peak obtained with the
available signal BW_s.
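
The sample-count requirement can be made concrete with a short calculation. The Python sketch below (hypothetical function names; independent measurements are assumed) gives the number of trials for which one false alarm is expected on average, and the probability of actually observing at least one false alarm in a given number of trials.

```python
# Sketch: how many independent measurements are needed to observe a false
# alarm at a given P_FA. Independence between measurements is an assumption.
def trials_for_one_expected_alarm(p_fa):
    """Number of trials N for which N * P_FA = 1."""
    return 1.0 / p_fa


def prob_at_least_one_alarm(p_fa, n_trials):
    """Probability that at least one of n_trials exceeds the threshold."""
    return 1.0 - (1.0 - p_fa) ** n_trials


p_fa = 0.001
print("trials for one expected alarm:", round(trials_for_one_expected_alarm(p_fa)))
for n in (200, 1000, 5000):
    p = prob_at_least_one_alarm(p_fa, n)
    print(f"N = {n}: P(at least one alarm) = {p:.3f}")
```

Under these assumptions, the 200 measurements taken per detector size would rarely show even one false alarm at P_FA = 0.001, which is consistent with the decision to verify the SNR measures rather than P_D and P_FA directly.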

IX.  Summary  and  Conclusions 

In this paper, we have studied the effects of detector
errors on the performance of an acoustooptic
time-integrating correlator used for signal detection. In our
analysis, we modeled the system's output to include the
effects of various detector parameters such as area
integration, elemental spatial weighting, and correlation
peak location within a detector element. As
performance measures we used P_D and P_FA and in our theory
derived expressions for P_D and P_FA in terms of detector
parameters and the easily measured SNR_1 and SNR_2
output correlation parameters.

To study the various detector effects, we performed
a general statistical analysis and quantified our results
for the case of signals and noise with a Gaussian-Markov
autocorrelation function. This provided us with
analytical results which fully describe the system's
performance as a function of various system, signal, and
detector parameters. From these expressions, we
quantified the system performance as a function of the
various detector error sources and system and signal
parameters in our model.

We found that area integration resulted in a variety
of effects such as a degradation in both P_D and P_FA as
the detector element size D increased. We found that
T_I and SNR_I effects were more dominant and that an
increase in either (if possible) was more significant than
a decrease in the detector area D. Thus system design
considerations dictate an increase in T_I or SNR_I (if
possible) to compensate for losses due to the finite
detector element size D. Our studies of the signal
bandwidth BW_s effects showed that both P_D and P_FA are
more sensitive to D (when BW_s is large) but that the P_D
and P_FA values obtained in this case were quite good.
Thus such issues appear to be of more concern when the
signal time bandwidth product is moderate (i.e.,
500-1000). The effect of the SBR and D was quantified and
found to be the most important and dominant effect on
the choice of D in a system design. It was shown that
even for the maximum possible SBR value of 0.5 (for
linear intensity modulated AO cells), the system's
performance degrades significantly as D increases.

The effect of the location of the correlation peak
within a detector element was also studied. From this
analysis, we found that the loss encountered as the
correlation peak location departed from the center of
the detector element depended on both the D and BW_s
values. These effects were quantified. In general, the
system's performance degraded as either the delay or
D increased, with the loss becoming more significant as
BW_s increased. The system's designer must select D
from the P_D and P_FA values obtained when the
correlation peak is located at the edge of the detector
element. As shown, this requires detector element sizes
much less than the Nyquist value (for the correlation
peak width) to achieve good P_FA performance.

To study the effects of the detector element's spatial
response, we conducted a simplified but
well-approximated statistical analysis. In this analysis, we used a
spatial response model that varied to include triangular,
trapezoidal, and rectangular detector response
functions. Our analysis was conducted under the
assumption of a low input SNR. For this case, we found that
a triangular profile enhanced performance (since it
suppressed the out-of-plane noise more than the signal).
In practical cases, the detector element's spatial
response is trapezoidal, and the improvement (over a
rectangular response) was found to be small (of the
order of 0.51 dB in SNR_1 or SNR_2). Thus future
statistical analyses do not seem to require elaborate
approximations of the detector's spatial response by a
trapezoidal function.

Our experimental work verified several of our
theoretical results (specifically the validity of our
theoretically predicted difference between SNR_1 and SNR_2).
The observed dependence of SNR_1 and SNR_2 on D
appears to be in very good agreement with our theory.
In all cases, our experimental results were in agreement
(within 10%) with our theoretically predicted
performance.

4. AN OVERVIEW AND SUMMARY OF
OPTICAL PATTERN RECOGNITION
RESEARCH USING FEATURE
EXTRACTORS AND CORRELATORS

HYBRID OPTICAL/DIGITAL IMAGE PATTERN RECOGNITION: A REVIEW

ABSTRACT

Optical processors offer parallel processing, high speed, compact system fabrication,
and low power dissipation, size and weight, and they have made great
strides in recent years. The architectures, algorithms and system fabrication of hybrid
pattern recognition processors are reviewed with emphasis on recent results
and on techniques appropriate for distortion-invariant multi-class pattern recognition
applications.


1. INTRODUCTION

The parallel processing advantages of optical pattern recognition (OPR) systems have long
been recognized. However, only recently have components, architectures, algorithms and a
commitment to fabrication of such systems emerged. As a result, this topic has seen an
explosion of conferences and research in recent years. Several recent reviews by the author
exist [1-3] and are summarized in this paper, with emphasis on work more
recent than that noted in the earlier reviews. Advances in laser diode and detector
technology and the commitment of several companies (General Dynamics-Pomona, ERIM, Grumman) and
funding agencies have now made fabrication of such processors and the reduction of research
to systems a reality. Spatial light modulator (SLM) technology is summarized in [5] and
is not discussed herein. These real-time devices still represent the major obstacle to the
widespread low-cost commercial exploitation of OPR systems. However, the future for this
aspect of OPR is quite promising. Recent Soviet work in this area has been most significant
[81]. These and several U.S. programs have concentrated on practical SLM device
technology. Many linear algebra operations are required in OPR [4] and are discussed elsewhere.
Thus, the present text assumes a familiarity by the reader with feature extraction and such
operations. The availability of two computer generated hologram (CGH) recorders has been a
significant adjunct to research and to the fabrication of OPR systems [6]. A general
purpose approach to optical computing (presently directed at signal processing rather than
image processing) is the use of optical linear algebra processors. These approaches and
systems are also summarized elsewhere [7] and are not discussed in this present paper. The
various SPIE [8] and IEEE [9] special issues on digital image processing, and the growing
number of OPR papers in them, attest to the importance of this topic.

In this present review, I restrict attention to OPR algorithms and architectures for
pattern recognition rather than image processing (i.e. image enhancement, restoration, etc.).
To those authors whose work is not referenced herein, I apologize and plead a lack of time
and space bandwidth product. Emphasis will be given to work at the Center for Excellence in
Optical Data Processing at CMU, because of my familiarity with it and because of the large
scope of its research in the area of optical pattern recognition. To best unify the large
volume of research work in OPR, I first review the basic operations achievable in optical
systems, two classic OPR architectures, and conventional feature-based pattern recognition
(Section 2). Various optical architectures for feature extraction are then reviewed and
discussed and results obtained on these system concepts and their present status are then
advanced (Section 3). Various new correlator approaches to distortion-invariant OPR are
then briefly reviewed together with optical AI/IU research and sub-pixel target
identification research (Section 4). SDF techniques to achieve various distortion-invariant 3-D object
recognition are then detailed with attention to new results and efficient phase-only and CGH
techniques to synthesize such filters (Section 5). Section 6 is devoted to system
fabrication issues with attention to new results and to flight-tests on compact architectures and
systems for OPR. Our summary and conclusions then follow (Section 7).

2.  FEATURE-SPACE  OPTICAL  PATTERN  RECOGNITION  (OPR) 

2.1  OPERATIONS  ACHIEVABLE 

In optical processors, 2-D data (images) are represented by the transmittance of a 2-D
data plane. By imaging one such data plane through another, we achieve the point-by-point
multiplication of the two 2-D data arrays. A lens can integrate this 2-D product
distribution (or any 2-D data distribution) and thus achieve a 2-D data summation (1-D data
summations are also possible using cylindrical rather than spherical lenses). With CGHs, random
interconnections between 2-D data arrays, coordinate transformations and other space-variant
operations are possible [6]. Thus, we can characterize and summarize the major operations
possible on optical systems as 2-D parallel data multiplication and addition. A specific
operation that has been the hallmark of coherent OPR is the 2-D Fourier transform (FT).

This operation is readily achieved with a simple lens or mirror. In Figure 1, the 2-D light
amplitude distribution incident on P2 is the 2-D FT G(u,v) of the input object g(x,y) placed
at P1,

$$G(u,v) = \iint g(x,y)\, e^{-j 2\pi (ux + vy)}\, dx\, dy, \tag{1}$$

where the spatial frequencies (u,v) of the input object are related to distances (x2,y2) in
P2 by

$$(u,v) = \left( \frac{x_2}{\lambda f_L},\ \frac{y_2}{\lambda f_L} \right), \tag{2}$$

where λ is the wavelength of the input light and f_L is the focal length of lens L1 in Figure 1.
If we place a filter function H*(u,v), i.e. a matched spatial filter (MSF), at P2, then the
light distribution leaving P2 is the 2-D data product distribution G(u,v)H*(u,v) and the P3
output is its FT or

$$u(x_3,y_3) = \mathcal{F}\left[ G(u,v)\, H^*(u,v) \right] = g \circledast h. \tag{3}$$

We represent FT distributions by upper-case letters and corresponding space functions by
the corresponding lower-case letters. The symbol $\mathcal{F}$ denotes the FT operator, the superscript
* denotes the complex conjugate and $\circledast$ denotes the correlation. The system of Figure 1 is
a frequency plane correlator. The optical correlation of two 2-D images can also be
realized in a joint transform correlator by forming the FT of the magnitude squared of the
FT of the two functions. To synthesize the H* complex conjugate transmittance function
required in (3), holographic techniques are used.

FIGURE 1

Conventional  optical  Fourier  transform  and  frequency  plane  correlator 
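
As a purely numerical illustration of (3), the frequency plane correlator can be imitated with discrete FFTs. The sketch below is not the optical system itself: it assumes sampled images, circular (periodic) correlation, and an inverse FFT in place of the second optical FT (which only changes the orientation of the output coordinates). The function name `frequency_plane_correlate` and the test data are hypothetical.

```python
import numpy as np

def frequency_plane_correlate(g, h):
    """Circular cross-correlation of g with h via the product G * conj(H)."""
    G = np.fft.fft2(g)                     # plays the role of the FT formed at P2
    H = np.fft.fft2(h)                     # reference transform; conj(H) is the MSF
    return np.fft.ifft2(G * np.conj(H))    # inverse FFT stands in for the second FT

# Hypothetical data: a small bright block as reference, a shifted copy as input
h = np.zeros((32, 32))
h[4:8, 5:9] = 1.0
g = np.roll(h, shift=(6, 3), axis=(0, 1))  # input = reference shifted by (6, 3)

corr = np.abs(frequency_plane_correlate(g, h))
peak = tuple(int(i) for i in np.unravel_index(np.argmax(corr), corr.shape))
```

In this sketch the location of the correlation peak gives the shift of the input relative to the reference, which is the shift-invariance property that distinguishes a correlator from the feature-space methods discussed next.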


2.2  CONVENTIONAL  FEATURE-SPACE  PATTERN  RECOGNITION 

The conventional digital and mathematical literature usually considers feature-space
pattern recognition. In this method (Figure 2), a set of M image features are calculated and
an N x N pixel image is represented as an M-dimensional feature vector x. The original
feature space is often transformed to a new decision space as y = A x with independent
features and dimensionality reduction. The axes of this space are a set of basis functions
φ_i and the elements of each vector y are the projections on the corresponding φ_i vectors that
define this space. A discriminant vector w_i for each class i is chosen such that w_i^T y > w_j^T y
for all j ≠ i. From the projection values, the class of the input object is determined.
The blocks in Figure 2 are chosen to best define subproblems and thus we need not transform
x into y as the first step, and then project y onto w. Rather, since
w_i^T y = w_i^T A x = (A^T w_i)^T x, we can project x directly onto a new
transformed linear discriminant function vector d_i = A^T w_i for the class i data.

FIGURE 2

Simplified  block  diagram  of  a  feature-based  pattern  recognition  system 
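
The equivalence between the two-step route (transform, then project) and the single projection onto the transformed discriminant vectors can be checked numerically. The following is a minimal sketch with randomly generated, hypothetical values for the matrix A, the discriminant vectors and the feature vector; the dimensions M, K and C are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, C = 8, 3, 4                  # features, reduced dimension, classes (hypothetical)
A = rng.standard_normal((K, M))    # transformation matrix, y = A x
W = rng.standard_normal((C, K))    # row i is the discriminant vector w_i
x = rng.standard_normal(M)         # feature vector of one input image

# Two-step route: transform to the decision space, then project onto each w_i
y = A @ x
scores_two_step = W @ y

# One-step route: project x directly onto d_i = A^T w_i
D = W @ A                          # row i is d_i^T = w_i^T A
scores_one_step = D @ x

# The class with the largest projection value is selected
predicted_class = int(np.argmax(scores_one_step))
```

Since D is computed off-line from the training set, only one inner product per class remains on-line.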


Most intra-class dimensionality reduction techniques are variations of the Karhunen-
Loève (KL) expansion [33] in which the elements of the transformation matrix A noted
earlier are the eigenvectors of the correlation matrix for all class i training set images.
The use of two two-stage KL transforms, how the means of each data class are handled, the
number of dominant eigenvectors used, and how the correlation matrix is calculated are among
the different versions of the KL algorithm [34] that can be applied. We will employ KL,
Gram-Schmidt (GS) [35], Fukunaga-Koontz (FK) [36] and Foley-Sammon (FS) [35] techniques in
our OPR research. KL methods yield maximum compression of data and an orthogonal basis
function set. GS methods are another technique to produce orthonormal basis functions,
whereas FK and FS techniques are appropriate for inter-class recognition problems. Various
classifiers used include nearest neighbor, nearest mean and the use of a least-square linear
discriminant function (LDF) w_i. These are among the more popular ones. Our major concern
in this present paper is OPR. Such digital PR post-processing algorithms are reviewed
elsewhere in this volume [37] and in other OPR references by the author. The important
points to emphasize are:

-the  concepts  of  a  feature-space,  basis  functions,  feature  vectors,  transformations  and 
dimensionality  reduction; 

-a training set is used to select A and w_i and this operation is off-line;

-the  only  required  on-line  operations  are  the  calculation  of  the  features ,  a  vector  inner 
product  and  the  associated  classifier  decisions. 
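
The split between off-line and on-line operations can be sketched for one simple KL variant. The code below is an illustration under stated assumptions, not the specific KL version used in the cited work: the correlation matrix is formed without removing the class means, the synthetic training set is hypothetical, and the helper name `kl_basis` is introduced here only for the example.

```python
import numpy as np

def kl_basis(train, k):
    """Return the k dominant eigenvectors (as rows) of the training-set correlation matrix."""
    R = train.T @ train / train.shape[0]     # M x M correlation matrix (means not removed)
    evals, evecs = np.linalg.eigh(R)         # eigh returns eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:k]      # indices of the k largest eigenvalues
    return evecs[:, order].T, evals[order]

rng = np.random.default_rng(1)
# Hypothetical training set: 50 feature vectors of length 6 lying in a 2-D subspace
basis_true = rng.standard_normal((2, 6))
train = rng.standard_normal((50, 2)) @ basis_true

A, evals = kl_basis(train, k=2)   # off-line: select A from the training set
x = train[0]
y = A @ x                         # on-line: k vector inner products
x_rec = A.T @ y                   # reconstruction from the reduced vector
```

Because the synthetic data occupy only two dimensions, two eigenvectors reconstruct each training vector exactly; with real features the number of dominant eigenvectors retained is a design choice.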

The high computational load associated with the feature generation and calculation is the
major concern. Thus, we concentrate on the use of the parallelism of optical processors
to achieve these functions and relegate the remaining operations in Figure 2 to a
general-purpose or dedicated digital hardware post-processor. The resultant hybrid
optical/digital system appears to perform properly in each instance. It also appears to be the
optimal combination of the parallelism of optics and the flexibility and decision making
advantages of digital processors. Optical systems using CGHs can also perform the required
transformations and projections directly on the 2-D input image data. The coded-phase
processor [36] is one method to achieve this. Examples of its use to realize FK [39],
least-squares [40,41] and the Hotelling trace [42] operations have also been recently reported.
The basic concept in these optical systems is to determine the linear combination
filter desired for each class (this is a linear combination of the training set images).
This LDF (linear combination filter) is then encoded on a mask. A separate encoding is
required for each input object class. The projection of the input test image onto each
discriminant function is then optically produced and the result is summed. The phase of the
input data is removed to allow different projections to appear on physically different
detectors in the output plane of such a system. The detector with the largest output then
denotes the class of the input object. Such a system (as presently described in the literature)
is not shift-invariant and is thus best described as a feature-space method. If
shifted versions of each input object are included, the space bandwidth product requirements
of the associated CGH increase linearly. Such methods are appropriate for achieving
shift-invariance of such a system; however, such details have yet to be published. The use of
optics in this case is thus most attractive when there are a large number of classes to be
searched. However, in general, the vector inner product operations required are not
computationally intensive unless the number of features used is also quite large.

3. FEATURE-SPACE OPTICAL PATTERN RECOGNITION (OPR)

In this section, we briefly discuss nine different optical feature extraction or generation
systems and their performance and status.


3.1  FOURIER  COEFFICIENT  FEATURE  SPACE 

Since the FT operation is automatically performed optically (Figure 1), this is an obvious
feature space. It is also attractive because it easily allows for dimensionality reduction.
The most attractive optical dimensionality reduction method is to detect and sample the optical
FT pattern (plane P2 of Figure 3a) with a detector with wedge and ring shaped detector
elements (Figure 3b). This concept was first advanced in [11] and used for screening of
aerial images [11], for various production quality inspection tasks [11], and with image
and wedge ring detector (WRD) detection planes for aerial image classification. The
commercial version of this device used 32 wedge and 32 ring-shaped detector elements. This
achieves dimensionality reduction from N^2 to 64 features. Since the intensity of the FT is
detected, the system is translation invariant. For real images, the FT is symmetric and no
information loss results from the separate use of the two halves of the FT plane. The wedge
outputs F(θ) are scale-invariant and their distribution shifts as the input object rotates.
Conversely, the ring outputs F(r) are rotationally-invariant and the distribution shifts as
the input object is scaled.


FIGURE 3

Optical Fourier coefficient feature space processor (a) and wedge ring
detector  optical  pattern  recognition  concept  (b) 
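
The wedge and ring sampling of the FT intensity can be imitated digitally, which also makes the translation invariance easy to verify. The sketch below is an assumption-laden illustration: it uses a discrete FFT of a square image, folds both half-planes together rather than devoting one half to wedges and one to rings, and bins pixels by nearest sector without interpolation. The function name `wedge_ring_features` and the random test image are hypothetical.

```python
import numpy as np

def wedge_ring_features(img, n_wedges=32, n_rings=32):
    """Sum |FT|^2 over wedge sectors and annular rings (both half-planes folded together)."""
    N = img.shape[0]                                   # square image assumed
    power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    v, u = np.indices(power.shape) - N // 2            # frequency coordinates about DC
    r = np.hypot(u, v)
    theta = np.mod(np.arctan2(v, u), np.pi)            # fold angles into [0, pi)
    r_max = N // 2
    wedge_idx = np.minimum((theta / np.pi * n_wedges).astype(int), n_wedges - 1)
    ring_idx = (r / r_max * n_rings).astype(int)
    inside = (r > 0) & (r < r_max)                     # skip DC and the plane corners
    wedges = np.bincount(wedge_idx[inside], weights=power[inside], minlength=n_wedges)
    rings = np.bincount(ring_idx[inside], weights=power[inside], minlength=n_rings)
    return wedges, rings

rng = np.random.default_rng(0)
img = rng.random((64, 64))                             # hypothetical input image
shifted = np.roll(img, shift=(5, -9), axis=(0, 1))     # circularly translated copy

w1, r1 = wedge_ring_features(img)
w2, r2 = wedge_ring_features(shifted)
```

The two feature sets agree because a translation changes only the phase of the FT, and the 64 outputs replace the N^2 image samples.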


The most recent pattern recognition work on this feature space has involved realization
of this detector using CGHs [6]. This allows more flexibility and lower cost and size.
The optical realization of this unique detector plane sampling appears essential because of
the large time required to digitally perform the necessary interpolations. Recent pattern
recognition tests on this WRD Fourier coefficient feature space were performed for the
purpose of distinguishing letters and different vehicles [12]. Results and details are
available elsewhere [12]. The highlights of this work were attention to the use of amplitude
versus phase Fourier coefficient features, the effect of noise, investigation of
three different feature extractor algorithms, and demonstration of scale and rotation-invariant
object classification and recognition using such a feature space. However, only
limited scaled and rotated versions of the input objects were tested.

3.2  WIGNER  DISTRIBUTION  FEATURE  SPACE 
The Wigner distribution (WD) function

$$W_{fg}(t,\omega) = \int f(t + \tau/2)\, g^*(t - \tau/2)\, e^{-j\omega\tau}\, d\tau \tag{4}$$

is a simultaneous time and frequency display of the signal data. For images, the WD is a
4-D display. Auto and cross WD functions can be defined similar to (4). The WD describes
local variations in the frequency content, whereas the FT provides global signal frequency
information. Since images are non-stationary, a WD feature space should be most useful.
One can optically produce a WD by many different techniques. An attractive method (4) uses
the FT of the product of the data in two AO cells at ±45°. A binary mask (using a magneto-optic
SLM [43]) allows a desired sum of different WD features to be achieved on-line on a
single detector, for which subsequent pattern recognition analysis is then greatly simplified.
This is essential since the WD of a 1-D function is a 2-D pattern. The most recent
review of this work [44] includes an SNR comparison of the optimality of WD features and
initial simulation results. For pattern recognition, the auto WD of an input and reference
are multiplied and integrated over time and space. In this case, the mask in Figure 4
would contain the WD of the reference(s). This appears to be an attractive approach for
many pattern recognition applications. The use of an optical processor and dimensionality
reduction technique as in Figure 4 is essential because of the increased dimensionality of
the output in a WD feature display. Researchers in Germany [46], Wisconsin [46] and CMU
[44] are the most active ones in this pattern recognition research area.

FIGURE 4

Simplified  Wigner  distribution  feature  space  and  detection 
pattern  recognition  processor 
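
To make the definition in (4) concrete, a discrete version can be written in a few lines. This is only a sketch under assumptions that are not part of the optical system: the signal is sampled and treated as periodic, the lag is taken in whole samples m so that τ = 2m (which compresses the frequency axis by a factor of two), and the function name `wigner` and the chirp test signal are hypothetical.

```python
import numpy as np

def wigner(f, g=None):
    """Discrete (circular) cross Wigner distribution; rows are time n, columns frequency k."""
    f = np.asarray(f, dtype=complex)
    g = f if g is None else np.asarray(g, dtype=complex)
    N = len(f)
    n = np.arange(N)[:, None]                            # time index
    m = np.arange(N)[None, :]                            # lag index, tau = 2m
    kernel = f[(n + m) % N] * np.conj(g[(n - m) % N])    # f(t + tau/2) g*(t - tau/2)
    return np.fft.fft(kernel, axis=1)                    # FT over the lag variable

t = np.arange(32)
f = np.exp(1j * np.pi * 0.01 * t**2)   # hypothetical chirp: frequency content varies with time
W = wigner(f)                          # 2-D pattern from a 1-D signal
```

The auto WD computed this way is real-valued, and summing each row over frequency returns N |f(n)|^2, which is a convenient consistency check. The N x N output for an N-sample signal shows directly why dimensionality reduction is needed before classification.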

3.3  CHORD  DISTRIBUTION  FEATURE  SPACE 

The chord distribution is defined for a binary boundary object only as the distribution
h(r,θ) of the lengths r and angles θ for all chords that can be drawn between boundary
points on the object. Denoting boundary image points by b(x,y) = 1, a chord defined by the
polar coordinates r and θ exists between any two points if

$$g(x,y,r,\theta) = b(x,y)\, b(x + r\cos\theta,\ y + r\sin\theta) = 1. \tag{5}$$

The distribution of all chords in the image is the integral of (5) or [21]

$$h(r,\theta) = \iint g(x,y,r,\theta)\, dx\, dy = b(x,y) \circledast b(x,y) = h(\xi_x,\xi_y), \tag{6}$$

where (ξx,ξy) = (r cos θ, r sin θ) and $\circledast$ denotes correlation. From the last expression
in (6), we see that the chord distribution can be obtained from the autocorrelation of the
boundary of the object. The autocorrelation h(ξx,ξy) thus contains information from which the
conventional chord distribution h(r,θ) can be obtained; however, complicated trigonometric and
square-root calculations are required for this transformation. This feature space is still
quite useful and attractive [23,24] except for the large computational load required to
compute these features.
In [20], we first noted that by sampling the autocorrelation of the object with a wedge ring
detector, the chord distributions

$$h(r) = \int h(\xi_x,\xi_y)\, r\, d\theta, \qquad h(\theta) = \int h(\xi_x,\xi_y)\, r\, dr \tag{7}$$

could be obtained directly. As before (Section 3.1), the advantages of an optical WRD are
again clearly needed to achieve this. Nichols [18] later also noted this and suggested its
calculation from a digital or optical FT of the optical power spectrum of the image. The
computational load in the interpolation required in (7) can rapidly become excessive however.
Thus, WRD sampling techniques [20] and other methods of optically producing the autocorrelation
[21] appear preferable. The general block diagram of the hybrid optical/digital chord
feature space processor we consider is shown in Figure 5.

FIGURE 5

Block  diagram  of  a  hybrid  optical/digital  chord  feature  space  processor 
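
The route from the boundary autocorrelation in (6) to the h(r) and h(θ) distributions in (7) can be sketched digitally. The code below is a hedged illustration only: it computes the autocorrelation with zero-padded FFTs, replaces the wedge and ring integrals by simple histograms over lag pixels (nearest-bin assignment, no interpolation), and assumes a square binary image. The function name `chord_histograms`, the bin counts and the two-point test object are hypothetical.

```python
import numpy as np

def chord_histograms(b, n_r=16, n_theta=16):
    """h(r) and h(theta) from the autocorrelation of a binary boundary image b."""
    N = b.shape[0]                                   # square image assumed
    P = np.zeros((2 * N, 2 * N))
    P[:N, :N] = b                                    # zero-pad so the correlation is linear
    B = np.fft.fft2(P)
    h = np.fft.ifft2(np.abs(B) ** 2).real            # autocorrelation h(xi_x, xi_y)
    h = np.fft.fftshift(h)                           # zero lag moved to index (N, N)
    eta, xi = np.indices(h.shape) - N                # lag coordinates (row, column)
    r = np.hypot(xi, eta)
    theta = np.mod(np.arctan2(eta, xi), np.pi)       # chords are undirected
    keep = r > 0                                     # drop the zero-lag term
    r_bins = np.linspace(0, np.sqrt(2) * N, n_r + 1)
    t_bins = np.linspace(0, np.pi, n_theta + 1)
    h_r, _ = np.histogram(r[keep], bins=r_bins, weights=h[keep])
    h_t, _ = np.histogram(theta[keep], bins=t_bins, weights=h[keep])
    return h_r / 2, h_t / 2                          # each chord appears at +lag and -lag

b = np.zeros((16, 16))
b[2, 3] = 1
b[2, 9] = 1                                          # two boundary points: one horizontal chord
h_r, h_t = chord_histograms(b)
```

For this test object a single chord of length 6 at angle 0 is counted, so each histogram sums to one. The per-pixel polar conversion in this sketch is exactly the digital computation the text describes as costly, which is the motivation for performing the wedge and ring sampling optically.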

Several authors [24] have expressed concern over such a feature space and its use for the
recognition of complex objects. However, our post-processing algorithm and testing [20-21]
have confirmed the usefulness of such a feature space. In [21-22], we extended the technique
in (6)-(7) to include a silhouette image of an object with internal gray levels.
These generalized chord distributions that result from such a feature space are much more
useful object descriptors than the original binary edge chord functions. They also promise
better noise performance [22]. In [19], Nichols considered the case when the dynamic range
of the data and more specifically its FT causes a type of edge enhancement to occur in the
data for which these features are extracted. In [21], we address the use of this chord
feature space for the classification of ships in the presence of out-of-plane rotational
distortions. For this, a training set of 12-18 ships per class was used. The 18 best h(θ)
and h(r) features were selected using KL and divergence measures and a Fisher LDF was computed
from the resultant training set data. Extensive tests [21] showed perfect 100% recognition
performance to be possible on separate test set ship imagery. In [22], we further
extended this technique to include in-plane scale and rotational distortion invariance and
methods to extract these in-plane distortion parameters from the resultant feature space
data. Initial demonstrations obtained with this technique were most attractive. By
properly weighting the chords of different lengths, global (large r) or local (small r) object
features can be emphasized or a weighted combination of both can be used for object
identification.

3.4 MOMENT FEATURE SPACE

The geometrical moments

$$m_{pq} = \iint f(x,y)\, x^p y^q\, dx\, dy \tag{8}$$

are a well-known and attractive feature space. However, the computational load in computing
such features is such that present systems are restricted to the calculation of moments for
binary objects or to the computation of only a few moments. Various techniques to optically
compute the moments of an input object exist. These include the use of computer generated
masks [13], a holographic mask [14], acousto-optic (AO) cells [15] and moment calculations
from 1-D projections [28]. In the system of Figure 6, the image f(x,y) is imaged
through masks g(x,y) at P2 on which the monomials x^p y^q are recorded on different spatial
frequency carriers. The products f(x,y) x^p y^q are formed in parallel by optical multiplication,
the integration is achieved by the output lens and each moment in (8) is formed at a
different detector in P3 (with the location of the detector determined by spatial frequency
carriers on the mask). In this way, all 21 moments up to fifth order can be produced optically
in parallel. The detector outputs are then fed to a digital post-processor which determines
the class of the input object, its orientation and the confidence of these estimates.

FIGURE 6

Optical  system  to  generate  all  moments  in  parallel 
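
The count of 21 moments follows from enumerating all (p,q) with p + q ≤ 5, i.e. (5+1)(5+2)/2 = 21. A minimal digital sketch of (8) on a sampled image is given below; the pixel-index coordinate system, the function name `geometric_moments` and the single-pixel test image are assumptions made only for illustration.

```python
import numpy as np

def geometric_moments(f, max_order=5):
    """Return {(p, q): m_pq} for all p + q <= max_order, using pixel indices as (x, y)."""
    y, x = np.indices(f.shape).astype(float)   # row index is y, column index is x
    return {(p, q): float(np.sum(f * x**p * y**q))
            for p in range(max_order + 1)
            for q in range(max_order + 1 - p)}

f = np.zeros((8, 8))
f[2, 3] = 2.0                                  # hypothetical image: one pixel at x = 3, y = 2

m = geometric_moments(f)
centroid = (m[(1, 0)] / m[(0, 0)], m[(0, 1)] / m[(0, 0)])
```

Each dictionary entry corresponds to one detector output in P3, and the centroid follows from the zeroth and first-order moments.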


This architecture and a moment feature space are attractive because of the ease with
which the computed moments can be corrected for different optical system and SLM error
sources [13]. A compact version of this system is under design together with alternate
ways to optically produce moments. This technique has been successfully demonstrated in
the classification of real ship images using very modest digital preprocessing operations
[47] and in successfully and accurately estimating the in-plane distortion parameters of
ship imagery [48]. The full hybrid pattern recognition system using this feature space is
shown in block diagram form in Figure 7. It consists of a first-level estimator that provides
class and aspect estimates. The hierarchical tree search used [17] is unique because
the classes separated at each node are selected automatically using a multi-class Fisher
projection method and because the discriminant vector used at each node is selected
automatically from a separate two-class Fisher selection technique. As always, these off-line
operations are performed on training set data and the only on-line operations required are
the vector inner products (one per node in the tree). Aspect estimates are obtained from
the ratio μ20/μ02 obtained from the computed central moments of the input test object.
An iterative nonlinear algorithm is then applied to the class and aspect estimates passed
from the first-level estimator. The final classification and orientation of the object is
then obtained in the second-level classifier. The algorithm used in the second-level classifier
is the minimum error Bayesian classification algorithm. This is possible because the
moments are jointly Gaussian random variables. Extensive tests have been conducted on a set
of 180 images of ships [17] in five different classes with 36 different aspect views per
ship and for a data base of 324 pipe parts [16] in five different groups with 36 different
aspect views for each of 9 different pipe objects. These tests showed excellent performance
(86% correct recognition of all ship images, 98% correct recognition of ship views within
50° of broadside, and 97% correct pipe classification). These tests used only 4-9 different
aspect views per class for training, required only 4-6 iterations in the Bayesian classifier
nonlinear algorithm, and showed that the use of the identity matrix as a valid approximation
to the covariance matrix was adequate. These issues greatly reduce the computational load
required on the digital post-processor. Details of this system and these results are available
elsewhere [16,17]. These tests on full 3-D distorted imagery using only a limited
training set, and a large test set have also been applied and verified on real imagery.
This makes such a feature-space pattern recognition technique appear most attractive and
demonstrates a clear role for optical processors in feature extraction based pattern recognition
algorithms.


FIGURE  7 

Hierarchical  hybrid  optical/digital  moment  feature-space 
pattern  recognition  architecture 


3.5  HOUGH  TRANSFORM  FEATURE  SPACE 

The Hough transform (HT) has recently received considerable attention and interest in
digital image processing because of its robustness and the ability to implement it on
pyramid digital architectures using simple histogram and accumulation operations only. Both
coherent [27] and non-coherent [25,26,28] optical architectures to compute the HT have been
detailed and demonstrated. The HT maps each line in an image into a point in a (p,θ)
feature space, where p is the perpendicular distance from the origin to the line and θ is
the angle that the normal to the line makes with the x axis [31]. This technique has been
generalized [30], extended to curve detection [32] and its similarity to an MSF noted [32].
In the optical realization of this transformation, the equivalence of a Radon transform (RT)
and HT is used [29]. The RT is defined as

$$\hat{f}(p,\theta) = \iint_{-\infty}^{\infty} f(x,y)\, \delta(p - x\cos\theta - y\sin\theta)\, dx\, dy. \tag{9}$$

This is equivalent to the projection of f onto an axis p at the angle θ (the integration is
along lines normal to this axis). We denote f̂ at one θ by f_θ and the full 2-D HT by f̂. To
provide insight, we note that a point (x0,y0) in f(x,y) is a sinusoid in f̂(p,θ) space
described by

$$p = x_0 \cos\theta + y_0 \sin\theta. \tag{10}$$

The RT is equivalent to the HT with the sinusoid weighted by the value (intensity) of the
(x0,y0) pixel point in f(x,y). In this feature space and transformation, a line in (x,y) is
a point in (p,θ), a curve is a set of points in (p,θ), etc. Thus, an object composed of
lines is described by a distribution of points in the (p,θ) Hough space.

A noncoherent architecture [28] to realize the RT or HT is shown in Figure 8. In this
simple system, the 1-D integration (projection) of f(x,y) is performed by a cylindrical lens and
f_θ is produced. The angle θ is varied, different f_θ projections at different θ are produced, and
the full f̂(p,θ) distribution is obtained by placing a rotating Dove prism behind the input object.
With a modest 500 rpm rotation rate, one f_θ slice of f̂ is produced every 60 µsec and a
full f̂ pattern at TV frame rates. To employ an HT feature space for pattern recognition,
the HT of the input test object is compared to the HT of the different reference objects by
whichever feature extractor and classification technique (Section 2.2) one desires.

FIGURE 8

Optical system to compute the Radon transform or
Hough  transform  by  1-D  image  projections  [28] 
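
A digital stand-in for the projection system can help to visualize (9) and (10). The sketch below accumulates each non-zero pixel into the bin p = x cos θ + y sin θ, weighted by its intensity. It is an illustration under assumptions that are not taken from the optical system: the origin is placed at the image centre, p is rounded to the nearest integer bin, θ is sampled in 1° steps, and the function name `radon_hough` and the single-point test image are hypothetical.

```python
import numpy as np

def radon_hough(f, n_theta=180):
    """Accumulate f(x, y) into bins of p = x cos(theta) + y sin(theta), cf. (9)."""
    N = f.shape[0]                                   # square image assumed
    ys, xs = np.nonzero(f)
    vals = f[ys, xs].astype(float)
    xc, yc = xs - N // 2, ys - N // 2                # origin at the image centre (assumption)
    thetas = np.arange(n_theta) * np.pi / n_theta
    p_max = int(np.ceil(np.sqrt(2) * N / 2))         # largest possible |p|
    out = np.zeros((2 * p_max + 1, n_theta))
    for j, t in enumerate(thetas):
        p = np.rint(xc * np.cos(t) + yc * np.sin(t)).astype(int) + p_max
        np.add.at(out[:, j], p, vals)                # weight each vote by the pixel intensity
    return out, thetas, p_max

f = np.zeros((32, 32))
f[10, 25] = 3.0                                      # single point at (x0, y0) = (9, -6) from centre
out, thetas, p_max = radon_hough(f)
trace = out.argmax(axis=0) - p_max                   # p of the single non-zero bin at each theta
```

For the single-point input, `trace` follows the sinusoid of (10) to within the bin rounding, and each column of `out` carries the intensity of the point, as the weighting remark in the text describes.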


3.6  OTHER  FEATURES  FROM  THE  RADON  TRANSFORM 

As noted above, the HT feature space is equivalent to the RT features generated on the
system of Figure 8. Smoothing of the input image and converting edges into lines is required
and possible by convolution with Gaussian and edge operators. By the central slice
theorem, the 1-D FT of f_θ is the 2-D FT of f evaluated along the line at θ. By the filter
theorem, the 1-D convolution in p at each θ of the projections of f and a reference function g
is the projection at θ of the 2-D convolution of f and g. Thus, all necessary 2-D filtering
operations are possible on projection vectors with 1-D operators. Conventional AO FT and convolvers
can easily achieve this at the 60 µsec rates needed (or even faster if required).
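
Both theorems can be checked numerically for the two projection angles that need no interpolation on a sampled grid (θ = 0 and θ = 90°). The sketch below uses discrete FFTs and circular convolution on random, hypothetical test arrays; it is a consistency check of the relations, not a model of the AO hardware.

```python
import numpy as np

rng = np.random.default_rng(2)
f = rng.random((16, 16))              # hypothetical image f(x, y); rows are y, columns are x
g = rng.random((16, 16))              # hypothetical reference function

F2 = np.fft.fft2(f)                   # 2-D FT of the image

# Central slice theorem: 1-D FT of a projection equals a line through the 2-D FT
proj_0 = f.sum(axis=0)                # projection at theta = 0 (integrate along y)
slice_0 = np.fft.fft(proj_0)
proj_90 = f.sum(axis=1)               # projection at theta = 90 degrees (integrate along x)
slice_90 = np.fft.fft(proj_90)

# Filtering on projections: 1-D convolution of projections vs projected 2-D convolution
conv2 = np.fft.ifft2(F2 * np.fft.fft2(g)).real                      # circular 2-D convolution
conv1 = np.fft.ifft(slice_0 * np.fft.fft(g.sum(axis=0))).real       # circular 1-D convolution
```

Here `slice_0` matches the row `F2[0, :]`, `slice_90` matches the column `F2[:, 0]`, and `conv1` matches `conv2.sum(axis=0)`, so the 2-D filtering result is available from 1-D operations on the projection vectors.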

Many other features can also be obtained from this RT output [28]. If the 1-D f_θ outputs
from Figure 8 are fed each T_θ = 60 µsec to an acousto-optic (AO) spectrum analyzer, their
FT is produced. One large area detector covering half of the FT plane produces a wedge
sample F(θ) of the FT of f each T_θ. A linear detector array with an integration time NT_θ in
the other half of the FT plane yields the FT ring samples F(r) each NT_θ. Thus, a WRD FT
feature space results. The moments m_nθ (the n-th moment of f_θ) can be computed and
related to the conventional m_pq. However, one can also simply compute the first ten m_pq
from only four projections [28]. At CMU, we often prefer to use the features directly
rather than converting them into m_pq features with a loss of information.

From two orthogonal projections 90° apart, the convex hull rectangular boundary of any
object can be determined. With N projections, an N-order polygon defining the object
boundary can be obtained. The projection widths versus θ result in a 1-D feature vector
w(θ). This or its FT can be used for object identification.
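
The width feature w(θ) can be sketched directly from the support of each projection. The code below is a minimal illustration: it assumes a sampled binary object, measures the width between the centres of the extreme pixels, and uses a hypothetical function name `projection_width` and a rectangular test object.

```python
import numpy as np

def projection_width(f, theta):
    """Extent of the object along the axis p = x cos(theta) + y sin(theta)."""
    ys, xs = np.nonzero(f)                         # pixels belonging to the object
    p = xs * np.cos(theta) + ys * np.sin(theta)    # position of each pixel on the p axis
    return float(p.max() - p.min())

f = np.zeros((16, 16))
f[5:9, 3:13] = 1.0                                 # hypothetical rectangle: 10 pixels wide, 4 high

w0 = projection_width(f, 0.0)                      # width seen at theta = 0
w90 = projection_width(f, np.pi / 2)               # width seen at theta = 90 degrees
thetas = np.linspace(0.0, np.pi, 36, endpoint=False)
w = np.array([projection_width(f, t) for t in thetas])   # 1-D feature vector w(theta)
```

The two orthogonal widths `w0` and `w90` give the sides of the enclosing rectangle, and the sampled vector `w` is the 1-D descriptor referred to in the text.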

Polar projections of the integral through the centroid (x̄,ȳ) of the object as a function
of θ are another useful descriptor of the object shape. If each projection f_θ is evaluated
at the one point

$$p(\theta) = \bar{x}\cos\theta + \bar{y}\sin\theta, \tag{11}$$

then the 1-D feature vector s(θ) results. This is similar to a chord transform, but only
for chords through the centroid (the centroid is easily obtained from moments). It is also
analogous to older polar space-variant optical transform work without the scale-invariant
Mellin transform properties of this earlier optics research [49].


3.7 AUTOCORRELATION OBSERVATION SPACE

The shape and distribution of the autocorrelation of an input object contains significant
information useful for object recognition. The general architecture for a processor to
analyze such an observation space is shown in Figure 9. In recent work [50], Merkle has
considered many different sampling methods and features calculated from an autocorrelation
observation space. These digitally-calculated features (computed from an optically-produced
autocorrelation pattern) may require extensive time. The features considered include contour
features such as chain codes and Fourier descriptors, various histogram operators, moments
of the autocorrelation function, etc. A large set of tests on different characters
was performed and the results obtained using different features were compared.

FIGURE 9

Block  diagram  of  a  hybrid  optical  /  digital  autocorrelation 
observation  space  pattern  recognition  processor  [50] 

3.8 OTHER FOURIER TRANSFORM FEATURE SPACES

In recent work, the use of CGHs and HOEs to optically realize various special sampling
functions (such as wedge ring detection) has been considered and experimentally demonstrated
[6]. In other FT observation space research, Duvernoy [84] considered isoenergy contours
in the FT plane. He computed the Fourier descriptors for such contours at various levels.
Several basis function analysis techniques were used to classify various types of terrain
(woods, fields and cities). These isoenergy contours are attractive because they combine
spatial frequency as well as directional information. The spatial frequency and directional
information is also available from wedge ring detector outputs; however, a WRD space provides
this information separately, not combined as in an isoenergy contour analysis.

3.9  DISCUSSION 

As  one  can  easily  see,  there  is  significant  new  research  on  optically  generated  features 
and  feature  extractors.  These  new  advances  allow  many  different  observation  spaces  to  be 
used. The optical generation and calculation of all major feature spaces has been demonstrated
and described. The attractive aspects of this research include:

(1)  The  same  architecture  can  compute  the  features  for  any  input  object.  Thus,  a  new 
architecture  is  not  necessary  for  a  new  object  identification  problem. 

(2)  The  parallelism  of  optics  in  feature  generation  and  the  flexibility  of  digital 
feature  extractors  and  classifiers  are  matched  quite  well  in  these  architectures. 

(3) A different discriminant vector w or feature extractor algorithm can easily be
included in the digital post-processor, should a given object identification data
base necessitate this.

(4) Dimensionality reduction can be and has been employed to reduce the calculations
required by the digital post-processor. Alternatively, the projections and
transformations can be optically implemented if desired using computer
generated holograms.

The  shortcomings  of  these  or  any  digital  or  analog  feature  extractor  for  pattern  recognition 
include: 

(1) A higher susceptibility to noise. This is a direct result of dimensionality
reduction.

(2) The need to segment the input object into interesting candidate regions before
feature extraction. We refer to pattern recognition architectures capable of
handling multiple objects simultaneously as shift-invariant (e.g., a correlator).

4.  RECENT OPTICAL CORRELATOR ADVANCES

4.1  MULTIPLE MSF CORRELATORS

Several advances in optical correlators are briefly reviewed in this section. The initial
description of Figure 1 assumed a single MSF at P2; however, multiple MSFs are also possible
[51-53]. In these cases, spatial and/or frequency-multiplexing of the MSFs is used. In
the space-multiplexed case, the FT of the input object must be replicated and multiplied by
the different MSFs at different spatial locations in P2, and the correlation of the input with
the different MSFs performed. Holographic lens arrays and HOEs [52], a fixed screen
technique [53] or a rotating grating [54] can be used to access these multiple filters. In the
latter case, separate output correlations appear sequentially. In the other cases, multiple
correlations are available in parallel or they can all be superimposed (the first choice
requires the analysis of multiple correlation planes whereas the second choice results in
poorer correlation plane SNR). The best choice depends upon the application.

4.2  SPECTRAL CORRELATORS

F.T.S. Yu [55], Ludman [56] and others have actively pursued the use of color or spectral
MSF processors for image processing (image subtraction, deblurring, etc.) and pattern
recognition [55]. These processors have the architecture of Figure 1 with a color input
image, a tricolor grating behind P1 and a white light source at red, green and blue
wavelengths. This forms the FT of the portion of the input in each spectral (color) band in a
different spatial location in P2. Thus, different MSFs can be applied to different spectral
data. Alternatively, objects in different colors in the input will produce correlation
peaks at P3 in different wavelengths. The power dissipation and availability of the
necessary light sources are a practical problem with such architectures. Color diversity
appears to be best utilized as an adjunct to the conventional x,y degrees of freedom
of the system to simplify system fabrication and output data analysis [57].

4.3  HYBRID OPTICAL/DIGITAL PATTERN RECOGNITION, IMAGE UNDERSTANDING AND
ARTIFICIAL INTELLIGENCE

The use of pattern recognition (PR), image understanding (IU) and artificial intelligence
(AI) techniques in a hybrid optical/digital architecture has recently been addressed in an
interdisciplinary program at CMU. A general diagram of the architecture is shown in Figure
10. The optical portion of the system produces features and correlations with generic SDFs
(see Section 5). Both optically and digitally computed features are considered, and the
optical systems are adaptively controlled by feedback from an AI/IU processor that compares
the results obtained to a world model and uses them to adaptively construct and modify
the world model. Such an advanced general architecture appears to be
most attractive for new supercomputers. Initial tests on aircraft images, an on-line
technique for producing reference objects in any 3-D orientation by synthesis of the object as
polygons, and related Hough transform feature representations for objects appear to make
such an architecture most attractive and realistic for advanced hybrid supercomputers.

4.5  SUB-PIXEL  TARGET  LOCATION,  TRACKING  AND  IDENTIFICATION 


Another recent optical correlator application under research at CMU involves the location,
tracking and identification of moving sub-pixel targets from space-based mosaic sensors.
The technique used involves: (1) the correlation of two successive image frames, (2)
sampling the central 3 x 3 or 5 x 5 region of the correlation plane, (3) estimating from
these samples the shift between the two successive image frames (this is achieved to
sub-pixel accuracy), (4) interpolation and resampling to shift one of the images by this
estimated sub-pixel amount, and (5) registration and subtraction of these two frames.
The shift and registration must be performed to sub-pixel accuracy to extract the target.

In Figure 11a, we show a typical input with a sub-pixel target 0.2 of a pixel in size. A
sequence of three such frames was produced with the background shifted by 0.1 pixels
frame-to-frame and with the target shifted by one pixel frame-to-frame. The result after
processing (Figure 11b) shows the successful location of the sub-pixel target and its relative
position in each frame. Such a time-history track file provides the necessary information
for target identification and classification.
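
The five processing steps listed above can be sketched digitally as follows. This is only an illustrative sketch under stated assumptions, not the optical implementation: the correlation is computed with an FFT (so it is circular), the sub-pixel shift is estimated by a parabolic fit through the central samples around the correlation peak, and the resampling uses the Fourier shift theorem. The function names and the particular estimator and interpolator are assumptions introduced here for illustration.

```python
import numpy as np

def fourier_shift(img, dy, dx):
    """Circularly shift img by (dy, dx) pixels using the Fourier shift theorem."""
    fy = np.fft.fftfreq(img.shape[0])[:, None]
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    ramp = np.exp(-2j * np.pi * (fy * dy + fx * dx))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * ramp))

def parabolic_offset(c_minus, c_zero, c_plus):
    """Vertex of the parabola through three equally spaced samples (assumed estimator)."""
    denom = c_minus - 2.0 * c_zero + c_plus
    return 0.0 if denom == 0 else 0.5 * (c_minus - c_plus) / denom

def estimate_shift(frame_a, frame_b):
    """Estimate the (dy, dx) shift of frame_b relative to frame_a."""
    # Step 1: correlate the two frames (circular correlation via the FFT).
    A = np.fft.fft2(frame_a - frame_a.mean())
    B = np.fft.fft2(frame_b - frame_b.mean())
    corr = np.fft.fftshift(np.real(np.fft.ifft2(B * np.conj(A))))
    h, w = corr.shape
    # Step 2: sample the correlation plane around its peak.
    py, px = np.unravel_index(np.argmax(corr), corr.shape)
    # Step 3: sub-pixel estimate from the row and column through the peak.
    oy = parabolic_offset(corr[(py - 1) % h, px], corr[py, px], corr[(py + 1) % h, px])
    ox = parabolic_offset(corr[py, (px - 1) % w], corr[py, px], corr[py, (px + 1) % w])
    return (py - h // 2) + oy, (px - w // 2) + ox

def register_and_subtract(frame_a, frame_b):
    """Steps 4 and 5: resample frame_b onto frame_a and subtract."""
    dy, dx = estimate_shift(frame_a, frame_b)
    return frame_a - fourier_shift(frame_b, -dy, -dx)
```

In such a sketch the stationary background cancels in the difference frame, while an object that moves differently from the background leaves a residual. A 5 x 5 fit or a full 2-D surface fit could replace the 1-D parabolas used here.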

5.  DISTORTION-INVARIANT SHIFT-INVARIANT OPTICAL CORRELATORS

Correlators are quite powerful pattern recognition processors with large processing
gain and shift-invariance plus the ability to handle multiple objects simultaneously. This
correlation operation is easily achieved optically (Figure 1). Although it is the optimum
detection scheme only for white Gaussian noise, its performance in practical structural
clutter is well-known and has recently been theoretically addressed [58]. Image processing
is quite tolerant of the dynamic range requirements of the data. In fact, binary data
performs quite well [59] and is often necessary with some SLMs [60]. The susceptibility of a
correlator to distortions between the input image and the reference MSF object is its
well-known shortcoming. A basic method to overcome this disadvantage and yet retain the
advantageous properties of a correlator is shown in Figure 12. This method involves the
synthesis of a synthetic discriminant function (SDF) from a training set of several images of
each class, rather than forming a single image representation of an object in one orientation
(and using multiple such images in a multi-channel correlator). The basic technique used is to
select a basis function set from the training images (these consist of different distorted views
of each object class) and from this synthesize an SDF. An MSF of this SDF is then produced
and used in an optical correlator (Figure 1). The SDF = h is a linear combination of the
basis function set {φ_n} or the training set images {f_n}

$$h(x,y) = \sum_n b_n \phi_n(x,y), \qquad h(x,y) = \sum_n a_n f_n(x,y). \tag{12}$$

This concept was originated by Hester and Casasent [61], and applied and demonstrated for
intra-class [62] and inter-class [63] recognition. The filter h and the associated orthonormal
basis function set selection by Gram-Schmidt, KL and other techniques have been detailed
previously [61-63]. The generalized matched filters (GMFs) of Caulfield [64,65] are a
special case of the SDF where the basis functions φ_n are the exponentials and a Fourier
coefficient feature space is used. No general solution to the N^2 coefficients required to
be computed in GMFs has been advanced and the system is not necessarily shift-invariant
because the full correlation plane response is not specified. The circular harmonic SDFs of
Arsenault [66] use one circular harmonic in the expansion of f(r,θ) to synthesize the filter.
These filters achieve only rotation-invariance with shift-invariance being a possibility.

In recent work, Stark [83] noted that high SNR may be required, that the choice of the
center of expansion and the harmonic(s) used is not easy, and that for complex objects the
resultant processor may not be shift-invariant. Stark [83] recently offered a vector
formulation, used FK techniques and retained several harmonics in an improved version of these
circular harmonic filters. Both these filters and the GMFs require far more extensive noise
and discrimination tests on large data bases before they can be properly assessed.

Since  SDFs  are  more  developed,  tested,  analyzed  and  have  a  clear  mathematical  basis  and 
synthesis  algorithm,  they  will  be  emphasized. 

FIGURE 12

General  synthetic  discriminant  function  (SDF)  matched  spatial  filter  (MSF) 
distortion-invariant  hybrid  correlator  concept 


SDFs are synthesized from the vector inner product matrix V of the training set images by
specifying the desired correlation plane values at different locations (such as the peak of
the correlation function). These values are specified by a vector u. Depending upon the
application, five different SDFs are possible. Each corresponds to a different vector u and
matrix V. However, in each case, the SDF is defined by

$$a = V^{-1} u \tag{13}$$

which specifies the coefficients a_n in (12) which define the SDF = h. If u is all unity,
an SDF with the same output correlation peak intensity for all objects of one class
(independent of the geometric distortion chosen) results. With alternate choices for u, a
two-class SDF with unity correlation peak values for objects of one class and zero peak values
for objects of class two results. Alternate projection values (1,2,3) allow one SDF to
discriminate between three object classes, with the value of the correlation output defining
the object class. Use of several SDFs and a truth table of the multiple correlation plane
outputs at each location yields the final type of SDF. These SDF synthesis techniques are
unified in [67]. Excellent performance on ship imagery has been obtained with these SDFs as
summarized in [68] where their performance in noise was also quantified.
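
The synthesis in (12) and (13) can be illustrated numerically with a minimal sketch. It assumes the training images are equal-size arrays that are linearly independent (so that V is invertible); the function name `synthesize_sdf` is hypothetical and introduced only for this illustration.

```python
import numpy as np

def synthesize_sdf(training_images, u):
    """Return (h, a) with a = V^{-1} u as in (13) and h = sum_n a_n f_n as in (12)."""
    # Each row of F is one training image f_n written as a vector.
    F = np.stack([np.asarray(img, dtype=float).ravel() for img in training_images])
    V = F @ F.T                                  # vector inner product matrix
    a = np.linalg.solve(V, np.asarray(u, dtype=float))
    h = (a @ F).reshape(np.shape(training_images[0]))
    return h, a
```

With u = (1, 1, 0, 0), for example, the inner product of h with the first two training images is one and with the last two is zero, which corresponds to the two-class case described above.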

We refer to these as projection SDFs. They do not often perform adequately, due mainly
to the fact that the synthesis algorithm specifies the correlation peak value at only one
point (the center of the correlation function). As a result, nothing prohibits peak
values above threshold from occurring for shifted versions of a false target (for which an
output below threshold, ideally zero, is expected). New correlation SDFs [69] overcome this
by specifying the correlation plane values for true and false targets at the central peak
and ±d_s pixels away in x and y. The specified value ±d_s pixels away is generally zero and
the value at the central peak is generally one (for true class objects) and zero (for false
class objects). This synthesis algorithm is realized exactly as before with the inclusion
of shifted versions of each training image. This results in a well-controlled correlation
peak shape (a large central peak and zero or low values ±d_s pixels away) for true targets
and zero values (central and ±d_s pixels away) for false targets. This also allows both
a peak threshold T and a peak to sidelobe ratio threshold C to be applied to the
output correlation plane pattern to determine if a candidate region of the input image
contains a target and the class of that target.

Recent tests performed with these correlation SDFs considered three automatic target
recognition (ATR) objects (Tank 1, Tank 2 and an armored personnel carrier, APC). Figure 13
shows representative images of the APC and one of the tanks. For each of these objects, 36
aspect views were available at 10° increments around the object at a given depression angle.
The target resolution on these images was degraded to about 50 x 20 pixels. The objective
was to form an SDF using only 6 or so different aspect views such that the correlation plane
pattern had a peak for one class and no peak for the other object class. Table 1 shows
results [69] obtained with three different types of correlation SDFs intended to discriminate
Tank 2 from Tank 1 independent of 3-D aspect distortions. As seen, 93-95% correct
classification with no missed targets is possible using only 6 aspect views to synthesize the
SDF and with the SDF tested against all 72 aspect views of both object classes. Table 2 shows
similar data for an SDF to discriminate APCs from tanks. Here, with 12 training set
images/class, we find perfect performance to be possible. Figure 14 shows noise test results
when four targets (2 tanks and 2 APCs) not present in the training set were placed in a
typical scene (Figure 14a) with an input SNR approximately equal to one. The output
correlation plane (Figure 14b) shows only two peaks at the correct locations of the two tank
objects. Clearly, the SDF has discriminated against the APC targets and other structured
noise clutter in this scene. These are typical of the excellent results obtained for full
correlation plane analyses of the SDF performance for 3-D distortion-invariant multi-class
target recognition in clutter.
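
The correlation SDF idea (the same synthesis, with shifted versions of each training image included) can be sketched as follows. This is an illustrative sketch only: it assumes circular shifts (`np.roll`) and a digital FFT correlation in place of the optical system, and the names `correlation_sdf` and `correlation_plane` and the parameter `ds` are hypothetical.

```python
import numpy as np

def correlation_sdf(true_imgs, false_imgs, ds=1):
    """Synthesize h with specified correlation values at the peak and +/- ds pixels away."""
    shifts = [(0, 0), (ds, 0), (-ds, 0), (0, ds), (0, -ds)]
    rows, u = [], []
    for imgs, peak in ((true_imgs, 1.0), (false_imgs, 0.0)):
        for img in imgs:
            for s in shifts:
                # Shifted copies of each training image join the training set.
                rows.append(np.roll(img, s, axis=(0, 1)).ravel())
                # One at the central peak of true-class images, zero elsewhere.
                u.append(peak if s == (0, 0) else 0.0)
    F = np.array(rows, dtype=float)
    a = np.linalg.solve(F @ F.T, np.array(u))    # a = V^{-1} u, as before
    return (a @ F).reshape(np.shape(true_imgs[0]))

def correlation_plane(img, h):
    """Circular correlation of img with h; element [0, 0] is the zero-shift value."""
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.conj(np.fft.fft2(h))))
```

For a training image of the true class, the resulting correlation plane is one at the origin and zero ±ds pixels away in x and y; for a false-class training image all five values are zero. A peak threshold and a peak to sidelobe ratio test can then be applied to these values.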

Recently, considerable attention has been focused on the importance of the efficiency of
optical correlators (i.e. the usable optical light in the output correlation plane compared
to the energy of the input image) [70] and on the use of phase-only MSFs [71] to improve
light efficiency. Butler and Riggins [72] distinguish between a phase-only MSF in which
only the phase of the MSF data is recorded on either an absorption medium or on a bleached
material. Phase MSFs provide more useful light and initial results indicate that they
provide better discrimination (correlation plane SNR). However, only limited simulations on
two similar letters were performed [71] and no theoretical basis has yet been advanced for
this result. The motivation for this recent attention and an entire conference session [72]
on CGH realization of SDFs is the excellent performance of these filters, the availability
of several commercial CGH recorders and the use of such CGH filters in the fabrication of a
compact SDF-based correlator. Gianino and Horner [75] quantified by simulation the expected
worse sensitivity of phase-only MSFs with respect to object distortions. Thus, the use of
SDFs with phase-only CGHs is a natural approach to consider. It provides better efficiency
and overcomes the distortion-sensitive performance of conventional filters. Riggins and
Butler [74] recently simulated the original equal correlation peak projection SDF with a CGH.
They found 1% light efficiency and good performance on ATR data. However, much more
extensive tests are required on the new advanced SDFs and on larger data bases. Kumar et al [76]

Since these prior tests did not employ the new correlation SDFs, we synthesized a
correlation SDF of the ship and tested it against real ship imagery with different amounts of
amplitude and phase data retained. We found that the phase-only filter performed well, but
that retaining two bits of amplitude MSF data and two bits of phase data gave significantly
better results. Thus, from these recent tests of ours, it appears that a filter with some
amplitude data present is preferable to a phase-only filter. Considerable future work and
results are anticipated in this area.
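
The filter variants compared above can be sketched digitally as follows. This is a sketch under assumptions, not the procedure used in the tests: the MSF is taken as the conjugate FFT of the reference, and the amplitude and phase are quantized uniformly, a scheme chosen here only for illustration. The function names are hypothetical.

```python
import numpy as np

def matched_filter(ref):
    """MSF of a reference image: the complex conjugate of its Fourier transform."""
    return np.conj(np.fft.fft2(ref))

def phase_only(H, eps=1e-12):
    """Keep only the phase of the filter (unit amplitude everywhere)."""
    return H / np.maximum(np.abs(H), eps)

def quantize(H, amp_bits, phase_bits):
    """Retain amp_bits of amplitude and phase_bits of phase (uniform levels, assumed)."""
    amp = np.abs(H) / np.abs(H).max()
    top = 2 ** amp_bits - 1                      # amplitude levels 0 .. top
    amp_q = np.round(amp * top) / top
    step = 2.0 * np.pi / 2 ** phase_bits
    phase_q = np.round(np.angle(H) / step) * step
    return amp_q * np.exp(1j * phase_q)

def correlate(img, H):
    """Correlation plane obtained by applying the filter H in the Fourier plane."""
    return np.real(np.fft.ifft2(np.fft.fft2(img) * H))
```

Comparing `correlate(img, phase_only(H))` with `correlate(img, quantize(H, 2, 2))` on the same inputs gives one way to examine the trade-off discussed above. The outcome depends strongly on how the amplitude is scaled before quantization.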

6.  SYSTEM  FABRICATION 

A 2-D real-time SLM is the key element for a successful parallel 2-D OPR system. The
state-of-the-art of these devices is summarized in [5]. Recent Soviet work has resulted in
high-performance PRIZ [81] and liquid crystal SLMs with high sensitivity, resolution and
efficiency and with unique properties such as directional spatial filtering, edge enhancement
and the ability to detect and respond only to dynamic or moving input objects [81]. Many
real-time optical correlators have been fabricated, described and demonstrated. The General
Motors system for robot inspection [77] is one such system which used a liquid crystal
real-time input SLM. Two-dimensional output readout was simplified and rotational invariance
accomplished by use of a rotating prism and a cylindrical optical system using two 1-D detector
arrays, rather than a 2-D readout array. Several multi-channel real-time optical
correlators using a liquid crystal input transducer and multiple MSFs have been fabricated at
Huntsville as described in [78]. In these systems, attention was given to filter synthesis
using weighted MSFs [82] to reduce, rather than overcome, scale and angular object
sensitivity and hence the number of multiple filters needed. Multiple MSFs on the same filter
were tested on these systems for light efficiency and spatially-separated MSFs were accessed by
different laser diodes in different input spatial locations. In the first system, all
correlation plane patterns were superimposed. In the second system, different laser diode
sources allowed separate MSFs to be accessed when the difference between the orientation or
scale of the input and reference object caused the correlation peak to drop sufficiently.

A recent magneto-optic SLM [43] offering low cost has been developed and demonstrated for
simple white light spatial filtering [80] image processing functions (rather than
correlations) and for low space bandwidth product CGH MSF correlations [60]. The binary (rather
than gray scale) response of this SLM and its present low resolution and low transmittance
are limitations that must be overcome before it will see general use.

None of these prior real-time optical correlator systems attempted to significantly
reduce the physical size of the optical system. A compact portable version of the Huntsville
optical correlator was recently fabricated by ERIM [79] and is shown in Figure 15. This
system uses four laser diode sources to access one of four spatially-multiplexed MSFs with
the output correlation plane detected by a 2-D charge injection device (CID) detector array.
The correlation unit is 15 x 23 x 42 cm and weighs 8 kg. The electronic support unit for it
is 15 x 26 x 35 cm and also weighs 8 kg. The total power consumption of this portable
compact optical correlator is 55 W. It is possible to fabricate far smaller and lower power
dissipation versions of this architecture and several of these are presently being
considered.

All prior well-engineered real-time optical correlators have used only one simple MSF or
several simple MSFs and have thus achieved only limited distortion-invariant pattern
recognition. While the physical size of these processors is significantly less than the classic
large optical bench processors, they are not yet compact enough for use in a missile. A
more practical optical correlator would be one which employed the advanced SDF MSFs and one
which was significantly smaller in size. The use of SDF filters would reduce the complexity
of the system and extend its practicality and versatility. The system of Figure 16 was
recently fabricated by General Dynamics-Pomona and demonstrated in initial tests using
computer generated hologram SDF MSFs. The system is less than 5 inches in diameter and
approximately 12 inches in length. It is intended for use in a 5 inch missile for on-line
real-time ATR pattern recognition. It employs folded optics, mirrors rather than lenses,
multiple SDFs, several output correlation planes, and presently a real-time liquid crystal SLM.
Tower tests and captive helicopter tests of this system are expected to be among the
highlights of OPR work in 1985. This real-time optical correlator of Figure 16 represents the
first such processor suitable for airborne use in a 5 inch missile that has reached hardware.
Further such commitments and research support by government and industry are essential to
provide the necessary transfer of technology from OPR research to airborne hardware.

7.  SUMMARY AND CONCLUSION

A brief review of the major operations achievable and the major OPR architectures has
been provided. This was followed by descriptions of nine different optical feature extractor
hybrid pattern recognition processors. These systems optically produced all of the major
image features using the parallelism of optics. Feature extraction and classification on these
optically generated features is then performed in a digital post-processor. The resultant
hybrid optical/digital systems combine the best advantages of optical and digital processors.
These techniques are quite noteworthy because the same optical architecture can compute
the indicated features for any input object and thus the same system is usable for any
object identification application. The discriminant function, feature extractor,
transformation and classifier used can be changed as desired by employing the flexibility of the
post-processor. In all cases, multi-class 3-D distortion-invariant pattern recognition is the
objective considered. Extensive tests have been made on several of these systems on large
data bases. These include a large number of related objects in different classes with 36
different aspect views of each object (at every 10° increment) from a 20-40° depression
angle. Thus, this represents a multi-class full 3-D distortion problem. Training sets
containing only 4-6 different spatial views per object class were found to be adequate to
provide excellent 86-98% correct object recognition, classification and identification in over
300 test images in one data set and over 175 test images in a second data base.

FIGURE 15

Compact  portable  real-time  liquid  crystal  optical 
correlator  using  4  matched  spatial  filters  [79] 

FIGURE 16

Photograph  of  the  General  Dynamics-Pomona  airborne  real-time  optical  SDF  correlator 
for  packaging  in  a  5"  missile  (Photo  courtesy  of  D.  Fetterly,  General  Dynamics-Pomona) 



When performance in high clutter and noise is required, a correlator is needed. With
SDFs, a 3-D distortion-invariant multi-class pattern recognition correlator is possible
with all of the advantages of a correlator retained and with distortion-invariance provided.
Excellent initial test results of the full correlation plane data were presented and a
typical example of the performance in clutter of the system was included. The final issue
in an OPR system is system fabrication. As shown, significant strides have recently been
made in this area with an advanced SDF-based real-time optical correlator package for a
five inch missile having been fabricated and initially tested. The world of optical pattern
recognition has a bright and attractive future in all aspects. However, further government
and corporate commitments are still necessary to ensure timely transfer of this technology
to hardware.

ABSTRACT

Diffraction pattern sampling provides a feature space suitable for object classification,
orientation and inspection. It allows significant dimensionality reduction. These
properties are best achieved by the use of specifically-shaped Fourier transform plane detector
elements and this can be realized with considerable flexibility, reduced size and improved
performance by the use of computer generated holograms.

1.  INTRODUCTION


The Fourier transform (FT) or diffraction plane of an object contains a distribution of
the spatial frequencies present in the input object. This distribution has many attractive
properties. The magnitude of the FT pattern is shift-invariant. Thus, translations of the
input image do not affect the magnitude of the Fourier coefficients. Higher horizontal or
vertical input spatial frequencies (u,v) lie further from the center (dc or zero spatial
frequency) of the FT plane

$$(u,v) = (x_2/\lambda f_L,\; y_2/\lambda f_L). \tag{1}$$

Input spatial frequencies oriented at an angle in the input plane appear at a radial distance
r = (x_2^2 + y_2^2)^{1/2} in the FT plane (where (x_2,y_2) are the distance coordinates of the FT
plane) and at an angle orthogonal to the orientation of the input data [1]. As the orientation
of the input spatial frequencies varies, the angle θ of the FT distribution also rotates.

As the scale of the input object changes, the radial distance at which the frequency peaks
are located also scales. Thus, spatial frequency and orientation information are
conveniently available in an FT plane representation. Also, such an FT plane representation is
most suitable for dimensionality reduction of the data. This issue is of considerable
practical importance since the space bandwidth product (SBWP) or number of frequency-plane
components required to represent the input object is equal to the input SBWP. Thus, no
advantage is obtained by use of an FT plane data representation (in terms of processing
requirements), unless dimensionality reduction is employed. Fortunately, an FT plane is
well-known to allow considerable data compression, especially for pattern recognition and object
identification applications. Hence, an appropriately-sampled FT plane provides a set of
features that are most useful for feature extraction based pattern recognition and object
identification.

In Section 2, we review the FT properties and prior approaches to efficient FT plane
sampling using elements such as the wedge ring detector (WRD). This section provides
motivation for our research. In Section 3, we describe our computer generated hologram
(CGH) WRD FT plane concept and in Section 4 we detail our synthesis approach for a WRD using
CGHs. Section 5 provides initial experimental results obtained using our CGH generated WRD
element. Advanced analysis issues and our summary and conclusions associated with this
system are then advanced in Section 6.

2.  WRD  PROPERTIES  AND  FEATURES 


For  completeness,  we  first  review  several  common  FT  properties  of  use  in  WRD-sampled  FT 
plane  analysis.  We  consider  real  input  functions  f(x,y),  i.e.  images.  Their  intensity  FT 
is  symmetric,  i.e. 


$$|F(u,v)|^2 = |F(-u,-v)|^2. \tag{2}$$

The intensity FT is also shift-invariant

$$|\mathcal{F}[f(x-a,\,y-b)]|^2 = |F(u,v)|^2,$$
The shift-invariance, rotation and scale properties in (3) - (5) make sampling of the FT
pattern intensity with wedge and ring shaped detector elements most attractive. The typical
optical arrangement used is shown in Figure 1. The input object is placed in plane P1 and its
FT pattern is formed at P2 by lens L1, where its intensity is sampled by a WRD. This detector
has wedge-shaped elements in one-half of the circular aperture and ring-shaped elements
in the other half of the circular aperture. Figure 2A shows this detector schematically.
In the version of this device that was fabricated and was commercially available, there were
32 wedge and 32 ring shaped detector elements, one set in each half of a one inch diameter
silicon sensor (Figure 2B). The 64 detector outputs are available in parallel and are fed through
amplifiers to autoranging amplifiers and potentially into a supporting digital processor for
analysis purposes. The output of any detector element can be manually selected and viewed
on a digital display, or the full set (or any partial set) of the 64 detector outputs can be
selected, automatically scanned and fed to a digital processor. Figure 2C shows the standard
control unit. From (4), the FT pattern is seen to rotate as the input object rotates. Since the
ring shaped detector outputs integrate over θ, the f(r) ring shaped detector output
distribution does not change with input object rotations. From (5), the FT pattern is seen to
scale inversely with changes in scale of the input object. Since the wedge shaped detector
outputs integrate over r, the f(θ) wedge shaped detector output distribution does not change
with input object scale changes. From (1), one can place the wedge shaped detector elements
in one-half of the FT plane and the ring shaped detector elements in the other half of the
FT plane with no loss of information (beyond that which occurs due to intensity sampling).
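
The wedge and ring sampling described above can also be modeled digitally. The following
is a minimal Python/NumPy sketch, not the authors' implementation: the function name
`wedge_ring_features`, the choice of 8 rings and 8 wedges, and the use of a discrete FFT on
a square image are all assumptions made for illustration. For simplicity each ring is summed
over the full annulus and each wedge over both opposite sectors; by the symmetry of the
intensity FT this differs from half-plane sampling only by a constant factor.

```python
import numpy as np

def wedge_ring_features(image, n_rings=8, n_wedges=8):
    """Sum the FT intensity of a square image over ring and wedge regions.

    Hypothetical helper for illustration. Returns (rings, wedges).
    """
    n = image.shape[0]
    # Centered intensity Fourier transform |F(u,v)|^2.
    intensity = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2

    # Integer frequency indices with the origin at the array center.
    k = np.arange(n) - n // 2
    ku, kv = np.meshgrid(k, k)
    r = np.sqrt(ku ** 2 + kv ** 2)
    # Fold the angle into [0, pi): opposite sectors carry the same intensity.
    theta = np.mod(np.arctan2(kv, ku), np.pi)

    r_max = n // 2
    inside = (r > 0) & (r <= r_max)   # DC term and corners are excluded

    ring_edges = np.linspace(0.0, r_max, n_rings + 1)
    rings = np.array([
        intensity[(r > lo) & (r <= hi)].sum()          # integrates over theta
        for lo, hi in zip(ring_edges[:-1], ring_edges[1:])
    ])

    wedge_edges = np.linspace(0.0, np.pi, n_wedges + 1)
    wedges = np.array([
        intensity[inside & (theta >= lo) & (theta < hi)].sum()   # integrates over r
        for lo, hi in zip(wedge_edges[:-1], wedge_edges[1:])
    ])
    return rings, wedges
```

In this discrete model the ring outputs are unchanged by a 90 degree rotation of the input,
and both sets of outputs are unchanged by a cyclic shift of the input. Scale invariance of
the wedge outputs holds only approximately on a sampled grid and requires energy
normalization.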

3. CGH/HOLOGRAPHIC OPTICAL ELEMENT (HOE) WRD CONCEPT

These concepts were first introduced by Stanley and Lendaris [1] and later exploited by
Recognition Systems Incorporated [2]. They find much use as mission screeners in the
identification of the class of different parts of an input scene [1], in object quality
inspection [2], line width analysis for ICs, handwriting analysis [3], for producing a generalized
chord distribution feature space [6], and in more recent work for object identification and
classification [4]. Although these diffraction pattern concepts are attractive, there are
several shortcomings with the present silicon detector units. These include: the lack of
availability of such silicon detectors, the desire for more compact units of smaller
physical size and weight, the frequent desire for a wider variety of detector shapes, and
the frequent need for more sensitive and higher speed detectors than one can obtain
with the wide area units necessary when fabricated in silicon. One can separate the
detection function and the specific sampling shape aspect of the detector elements by sensing
the FT pattern using a conventional 2-D grid scan pattern and then digitally implementing
various desired detector shape functions. The interpolation required to accurately model
the detector shape desired is a significant overhead in a digital realization and often
precludes real-time operation. Hence, an optical realization using a CGH to achieve the
desired sampling function and a linear array of separate high-performance detectors with
parallel outputs is preferable. CGHs and holographic optical elements (HOEs) are presently
receiving considerable attention [5] with the availability of several commercial CGH
recorders. Thus, this optical approach is also of considerable practical and current interest.

The compact CGH/HOE-based system we envision is shown in Figure 3A and in block
diagram form in Figure 3B. The FT of the input object is formed at P2 where a CGH and HOE
are placed. The CGH has different grating patterns in different regions, with each region
having a different shape and location (corresponding to the specific detector shapes
required). In each region, the grating is of one spatial frequency and one orientation (the
spatial frequency and orientation differ in each region) and determines the location in P3
where the data in each P2 region focuses. An HOE recorded on the CGH plate at P2 achieves
the focusing of each P2 region to a separate point in P3.

In practice, the CGH/HOE could be reflective and a folded optical system of reduced size
would result. The separate wedge and ring outputs (or other FT plane sampling shapes
desired) are produced in spatially-separated regions of P3. Detector arrays or discrete
detectors placed at P3 provide parallel outputs corresponding to the wedge and ring sampled FT
plane data. This separation of the sampling and detection functions allows high-speed and
high-sensitivity detectors to be used. These parallel outputs would then be fed to a
dedicated digital processor to perform feature extraction (selection, weighting and combining of
the different wedge and ring detector outputs as required for a given application) in this
wedge/ring-sampled FT feature space and estimation of the class, orientation and scale of
the input object. The classification (for out-of-plane distortions) is performed by
projecting the wedge/ring-sampled FT feature vector onto a discriminant vector selected by various
pattern recognition techniques [4].


Plane  P2  need  not  be  an  FT  plane.  If  it  is  an  autocorrelation  plane,  then  the  wedge/ring 
features  produced  are  the  chord  distributions  [6-8].  As  noted  earlier,  with  a  CGH,  one  is  not 
restricted  to  wedge  and  ring  sampling,  but  any  desired  sampling-shaped  function  can  be  used. 
FIGURE 3

Preferred  computer  generated  hologram  (CGH)/holographic  optical  element  (HOE) 
realization  of  an  optical  wedge  ring  detector  (WRD)  system 


4. CGH WRD DESIGN

A schematic and block diagram of the laboratory WRD/CGH system are shown in Figures 4A
and 4B respectively. The input P1, CGH P2 and output P3 detector plane coordinates are
shown in Figure 4. The CGH at P2 achieves the desired wedge-ring sampling and diffracts
all light (incident on each separate wedge and ring sampled P2 region) at a different angle
(proportional to the spatial frequency and orientation of the grating present in each P2
region). Lens L2 focuses the parallel light from each wedge-ring P2 region to a different
location in P3, where separate high-performance detectors collect this light and provide the
desired wedge-ring sampled output data in parallel. This WRD sampling and detection
technique using a CGH is preferable to the holographic recording of the necessary pattern in
each P2 region as proposed in Ref. [9]. Our proposed CGH technique requires no sophisticated
optical system for recording and is thus simpler and cheaper. It allows phase relief CGH
recordings to be used and thus has the same high-efficiency advantage as the technique in
Ref. [9] when using bleached dichromated gelatin, but with much easier fabrication, with
greatly increased flexibility and at a significantly lower cost.

For simplicity, we describe each region of the CGH by a square-wave grating of unit
amplitude varying in x only as

$g(x) = [\mathrm{Rect}(x/\Delta x) * \mathrm{Comb}(x/d)]\,\mathrm{Rect}(x/L)$,   (6)

where Δx is the width of each bar in the grating, d is the grating spacing, u_1 = 1/d is the
grating frequency, L is the grating's extent and Comb(x/d) = |d| Σ_n δ(x − nd). We could employ
a sinewave grating in each P2 plane region. However, a square-wave grating is more easily
fabricated using a binary CGH or a binary recorder. Use of a sinewave grating would result
in only one diffracted order and a slight increase in useable light, and would not require
attention to avoiding higher diffracted orders. However, as we show in this paper, the
present design with a square-wave grating presents no problem, achieves adequate light
budget efficiency, and requires lower resolution than a sinewave grating. Primarily, the use of
a square-wave grating allows simpler binary recording systems to be employed. With g(x) in
(6) placed at P2, its FT is formed at P3 in Figure 4 and is


$G(u) = L\,\Delta x\,d\,[\mathrm{Sinc}(u\,\Delta x)\,\mathrm{Comb}(ud)] * \mathrm{Sinc}(uL)$,   (7)

where u = x_3/(λ f_L2) relates spatial frequencies u at P2 to distance x_3 in P3. Eq. (7) shows
that the data from one such 1-D grid produces a P3 pattern containing sinc functions of
width 1/L replicated every 1/d with an overall amplitude weighting across all of the sinc
functions given by a sinc function of large width 1/Δx.
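
As a rough illustration of the sinc envelope in Eq. (7), the sketch below evaluates the
intensity of the n-th diffracted order (located at u = n/d) relative to the zero order. It
assumes the normalized definition Sinc(x) = sin(πx)/(πx), which the text does not state
explicitly, and the function names are hypothetical.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x)/(pi x); this normalization is an assumption."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def relative_order_intensity(order, duty_cycle):
    """Intensity of a diffracted order relative to the zero order.

    duty_cycle is the bar width divided by the grating spacing (delta_x / d).
    The envelope Sinc(u * delta_x) is evaluated at u = order / d.
    """
    return sinc(order * duty_cycle) ** 2

if __name__ == "__main__":
    for duty in (0.5, 0.3):
        values = [relative_order_intensity(n, duty) for n in (1, 2, 3)]
        print(duty, [f"{v:.3f}" for v in values])
```

Under this assumption, a grating whose bars fill half of each period has no even diffracted
orders, while other bar widths leave a nonzero second order that the detector layout must
avoid.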
FIGURE 4

(A)  Schematic  and  (B)  block  diagram  of  a  laboratory  wedge  ring  detector  (WRD)  computer 
generated  hologram  (CGH)  holographic  optical  element  (HOE)  system. 


The location in P3 of the grating data in the corresponding P2 region is thus

$x_3 = \lambda f_{L2}/d$,   (8)

where d is the spacing between adjacent rectangular pulses of the square-wave grating. The
detector plane P3 size, the size of each detector element, and the length of the system (f_L)
determine d and the angle θ for each grating region of P2. We consider a circular CGH of radius
R with wedge shaped elements in the upper half and ring shaped elements in the lower half.
The highest spatial frequency u_m in the P1 input image determines the radius required for the
CGH as

$R = \lambda f_{L1} u_m$.   (9)
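
Eqs. (8) and (9) are simple enough to evaluate directly. The sketch below does so for
the values quoted in Section 5.1 (λ = 0.465 µm, f_L1 = 400 mm, f_L2 = 815 mm, 2R = 7.8 mm,
d = 0.02 mm); the function names are hypothetical and all lengths are in millimeters.

```python
def first_order_position_mm(wavelength_mm, f_l2_mm, grating_spacing_mm):
    """Eq. (8): distance x3 from the P3 origin of the first diffracted order."""
    return wavelength_mm * f_l2_mm / grating_spacing_mm

def max_input_frequency_cy_per_mm(cgh_radius_mm, wavelength_mm, f_l1_mm):
    """Eq. (9) solved for u_m: highest input spatial frequency a CGH of radius R passes."""
    return cgh_radius_mm / (wavelength_mm * f_l1_mm)

if __name__ == "__main__":
    wavelength = 0.465e-3   # mm
    print(first_order_position_mm(wavelength, 815.0, 0.02))       # about 18.9 mm
    print(max_input_frequency_cy_per_mm(3.9, wavelength, 400.0))  # about 21 cy/mm
```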

We consider two detector formats: a rectangular array (Figure 5A) and two
circularly-symmetric detector arrays (Figure 5B). The rectangular detector array offers the use of a
simpler commercial detector system, but places higher requirements on the CGH. The circular
detector arrays require a far simpler CGH but individual detectors in a nonstandard and
therefore less commercially available array configuration. Both CGHs have the general form
shown in Figure 5C with wedge shaped sampling elements in one half and ring shaped elements
in the other half of the plane. Each CGH region contains a grating of spatial frequency
d_ij^{-1} and angle θ_ij, where the subscripts correspond to the associated detector element.
For simplicity, only one wedge and one ring grating pattern are shown in Figure 5C.
FIGURE 5

Output  plane  P3  detector  geometries  (A)  rectangular  detector, 

(B)  concentric  detector  arrays,  and  (C)  basic  WRD  CGH  geometry 

4.1 RECTANGULAR DETECTOR ARRAY

For the rectangular detector array (with square-wave CGH gratings), we require

$H > 2b$   (10)

to insure that the second-order terms from the CGH do not fall on the detector array, and
we require that the position of the detector array in P3 be offset from the optical axis by

$H' > b$   (11)

for similar reasons. The conditions in (10) and (11) are most easily derived for the
vertical detectors i in column j = 1. If the grating spacing satisfies

$d(i,1) = \lambda f_{L2}/h(i,1) = \lambda f_{L2}/[H-(i-1)s]$   (12)

for detector element (i,1) in the first column (j = 1), then the second order is insured to
fall outside the detector array for the other detector array elements, where h(i,j)
denotes the distance from the origin of P3 associated with detector element (i,j). If the
detector size s is fixed, so is b, and the minimum grating spacing d_m (maximum spatial
frequency d_m^{-1}) must satisfy

$d_m < (\lambda f_{L2}/2b)\cos\theta$,   (13)

where the term in parentheses is the grating spacing required for the top right detector element in Fig. 5A.

To produce diffracted light focused onto a row of spots at each line in P3, the gratings
in each region of P2 must be oriented at an angle θ(i,j) (Figure 5C) to the horizontal x
axis satisfying

$\theta(i,j) = \arctan\{(j-1)s/[H-(i-1)s]\}$.   (14)

The grating separation in the P2 region corresponding to detector (i,j) must thus satisfy

$d(i,j) = \{\lambda f_{L2}/[H-(i-1)s]\}\cos[\theta(i,j)]$.   (15)

Eqs. (14) and (15) define the grating spatial frequency required in each P2 region subject to
the grating spacing constraints on d in (12) and (13). This CGH design requires a grating
with period d inclined at an angle θ to be recorded in each P2 region, with a different d and
θ for each region (Figure 5C). This requires considerable resolution and accuracy compared
to our concentric detector array system.
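
The design rules in Eqs. (14) and (15) can be tabulated with a few lines of code. The sketch
below is illustrative only: the function name is hypothetical, indices i and j are taken as
1-based, and the numerical values (H = 19 mm, s = 1.5 mm, λ = 0.465 µm, f_L2 = 815 mm) are
those quoted in Section 5.1.

```python
import math

def rectangular_grating(i, j, H, s, wavelength, f_l2):
    """Grating angle (degrees) and spacing for detector (i, j), per Eqs. (14)-(15).

    i is the row index and j the column index, both starting at 1.
    All lengths share one unit (millimeters in the example below).
    """
    y = H - (i - 1) * s              # vertical distance of row i from the P3 origin
    x = (j - 1) * s                  # horizontal offset of column j
    theta = math.atan2(x, y)         # Eq. (14)
    d = (wavelength * f_l2 / y) * math.cos(theta)   # Eq. (15)
    return math.degrees(theta), d

if __name__ == "__main__":
    for i in range(1, 5):
        row = [rectangular_grating(i, j, 19.0, 1.5, 0.465e-3, 815.0) for j in range(1, 6)]
        print(i, [f"{angle:5.2f} deg, {d * 1e3:5.2f} um" for angle, d in row])
```

With these inputs the first-column spacings come out near 0.020 mm for i = 1 and 0.026 mm
for i = 4. Eq. (15) is equivalent to dividing λ f_L2 by the straight-line distance from the
P3 origin to the detector, which is a convenient cross-check.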

4.2 CONCENTRIC CIRCULAR DETECTOR ARRAY

For the circular detector array configuration (Figure 5B), the CGH design is far simpler
than the case considered in Section 4.1. In this present system, the wedge shaped detector
elements lie in the top part of the CGH and the ring shaped elements lie in the bottom
portion. The grating spacing d_W is fixed for all wedge regions and only θ is varied between P2
regions. The grating spacing d_R for all ring regions is also constant and again only the
angle θ of the grating is varied between ring regions. If a Calcomp plotter is used to
synthesize this CGH, the end points of each line are specified and the plotter draws the
desired line at the necessary angle. In a CGH recorder, the coordinates of each point are
generally required and thus sampling effects will be of more concern. This issue is common
to both detector array cases, since the grating in each P2 region is at a different angle in
both detector cases. The CGH patterns at P2 for the rectangular detector array and the
circular detector array are similar, as shown in Figure 5C. The first-order diffracted
radii h_W and h_R for the wedge and ring gratings of spacing d_W and d_R must satisfy

$h_W = \lambda f_{L2}/d_W, \quad h_R = \lambda f_{L2}/d_R$.   (16)

We can avoid overlapping of the first and second orders by selecting

$2h_W - h_R > s_d$ and $h_R - h_W > s_d$,   (17)

where s_d is the diameter of a detector. Each grating produces + and - diffracted orders.
The inner circle of peaks in Figure 5B corresponds to these + and - orders for the wedges
and the outer circle corresponds to these for the ring elements. If there are M wedges in
the top half of the CGH and M rings in the bottom half, then the bisector for the i-th wedge
region is a line

$y = K(i)x$, where $K(i) = \tan[(\pi/M)(i-0.5)]$.   (18)

The line perpendicular to the bisector is y = [-1/K(i)]x + C and the angle that grating i
makes with the +x axis is thus

$\theta(i) = \arctan[-1/K(i)]$.   (19)


For simplicity (Figure 5B), the same grating angles are used for both the wedge and ring
gratings.

The last design issue we consider concerns the spot size s_2 on the CGH, its diffracted
spot size s_3' on the detector, the size s_1 of an input image region of one uniform
spatial frequency, and the size s_3 = s_d of a P3 plane detector. This is a unique issue and
requirement for CGH/WRD systems. One spot of diameter s_2 at P2 will produce a spot of
diameter s_3' = 2λf_L2/s_2 at the detector plane. This P2 spot diameter is due to a region in
P1 of minimum diameter s_1 = 2λf_L1/s_2. For simplicity, 1.22 factors have been omitted in the
above spot size equations. The detector size s_d = s_3 must thus satisfy

$2\lambda f_{L2}/s_2 < s_3 < \lambda f_{L2}[(1/d_R)-(1/d_W)]$   (20a)

$s_3 < 2\pi h_W/M$   (20b)

where the left side of (20a) insures s_3 > s_3' to collect all diffracted light from P2, the
right side of (20a) insures that the wedge and ring detectors do not overlap and (20b)
insures that the wedge detectors themselves (lying at a radius h_W) do not overlap. We will
quantify these values for our experimental system shortly.
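
A short sketch of how the concentric-array constraints might be checked numerically is
given below. It is an illustration under stated assumptions, not the authors' procedure:
the function names are hypothetical, the upper bound in (20b) is taken here to be
2πh_W/M, and the example values (h_W = 3.8 mm, h_R = 5.4 mm, s_d = 1.5 mm, M = 10) are those
quoted in Section 5.2.

```python
import math

def concentric_design_checks(h_w, h_r, s_d, n_wedges):
    """Evaluate the separation conditions for the concentric detector layout."""
    return {
        # Eq. (17): second-order wedge spots must clear the ring circle.
        "second_order_clear": 2 * h_w - h_r > s_d,
        # Eq. (17): ring spots must clear the first-order wedge circle.
        "rings_clear_of_wedges": h_r - h_w > s_d,
        # Eq. (20b), assumed form: neighboring wedge spots must not overlap.
        "wedge_spots_separate": s_d < 2 * math.pi * h_w / n_wedges,
    }

def bisector_angle_deg(i, n_wedges):
    """Angle of the i-th wedge bisector from Eq. (18), for i = 1..M."""
    return math.degrees((math.pi / n_wedges) * (i - 0.5))

if __name__ == "__main__":
    print(concentric_design_checks(3.8, 5.4, 1.5, 10))
    print([round(bisector_angle_deg(i, 10), 1) for i in range(1, 11)])
```

For M = 10 the bisector angles step in 18 degree increments starting at 9 degrees.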

5.  EXPERIMENTAL  RESULTS 

The experimental results for two WRD CGHs with M = 10 wedge elements and 10 ring elements
follow.

5.1  RECTANGULAR  DETECTOR  ARRAY 

For this case, a 4 x 5 detector array (Figure 5A) is used with a = 7.5 mm and b = 6.0 mm
with s = 1.5 mm detectors on 1.5 mm centers. For the experiment performed, f_L2 = 815 mm and
λ = 0.465 µm (an argon laser line). To satisfy H > 2b in (10), we selected H = 19 mm for a
corresponding minimum grating spacing d_m = 0.0186 mm from (13). The grating spacings and
angular orientation for each region were selected from (14) and (15). The grating spacing
varied from 0.020 to 0.026 mm for the detectors in the first column (j = 1) and the grating
angle for the detectors in the j = 2 column varied from 4.38° to 5.73°. Over the entire
detector array, d varied from 0.0186 to 0.0260 mm and θ varied up to 30°. The general design
guidelines for the grating spacings and grating angles for the CGH region corresponding to
detector (i,j) = (vertical, horizontal) satisfy (14) and (15), where i = 1 and 2 for wedges
and i = 3 and 4 for ring elements. The end points of each grating line were specified and a
line drawn between them using our Calcomp plotter. The full plot was 10" = 254 mm in
diameter with the smallest grating interval being 0.635 mm (the Calcomp plotter easily produced
lines with spacings of 0.02" = 0.5 mm, i.e. well within our 0.635 mm requirements). This
plot was photoreduced by 32.5:1 to 7.8 mm diameter with d_m = 0.02 mm. The 2R = 7.8 mm
diameter allows a maximum input spatial frequency u_m = 21 cy/mm (assuming λ = 465 nm and
f_L1 = 400 mm), which is more than adequate for realistic imagery.


FIGURE  6 

(A)  Computer  generated  hologram  and  (B)  output  plane  pattern  for  a  wedge  ring  detector 
computer  generated  hologram  with  an  output  rectangular  detector  array 
Figure 6A shows the CGH used and Figure 6B shows the 2-D rectangular P3 output
diffraction pattern obtained when this CGH is illuminated with a plane wave. The upper two rows
correspond to the ten wedge outputs and the bottom two rows to the ten ring outputs. The
wedges and the inner rings in Figure 6A are not as easily visible because of their higher
grating spatial frequencies. The number of grating lines in the wedge regions varied from
52 to 167 and the number of lines in the ring regions varied from 17 to 227. This accounts
for the different intensities (larger spot sizes) in Figure 6B. The spots diffracted by the
wedge regions are more uniform. Because the area and number of lines in each ring region
vary, their light intensity varies more, as is seen. The locations of the diffracted
output peaks are in agreement with theory within measurement accuracy.

5.2  CONCENTRIC  DETECTOR  ARRAY 

The CGH for this case (Figure 7A) and the resultant P3 diffracted pattern (Figure 7B)
again agree with theory. The same λ and f_L2 are used. The original CGH produced by the
Calcomp plotter for this case was 8" = 203.2 mm in diameter with the same d_W = 0.042" =
1.07 mm for all wedges and d_R = 0.76 mm for all rings. After photoreducing by 20.7:1, the
CGH had: 2R = 9.8 mm, d_W = 0.052 mm and d_R = 0.037 mm. The 2R value allows u_m = 26 cy/mm
input spatial frequencies (assuming f_L1 = 400 mm). These d_W and d_R choices satisfy (16) and
(17) with the detector size used s_d = 1.5 mm (since h_R = 5.4 mm and h_W = 3.8 mm). The i-th
grating angle is θ(i) = (i − 0.5)18° from (19). These values also satisfy (20) for P2 FT
plane spots above s_2 = 0.5 mm, corresponding to uniform input spatial frequency regions as
small as λf_L1/s_2 ≈ 0.4 mm for f_L1 = 400 mm. The light intensity in the two sets of
concentric diffracted peaks (Figure 7B) varies (for the outer set of ring elements) due to the area
of the rings and the varying number of grating lines per ring (27 to 266 lines).

6.  ADVANCED  TOPICS 

The resolution (0.006" = 0.15 mm) of the Calcomp plotter used in these experiments
determines the system's size and the number of wedge and ring elements used. Commercially
available recorders with 1 µm resolution [5] allow fabrication of a system of significantly
reduced size. For the system of Figure 4 with λ = 820 nm (a laser diode source), a 14 x 14
mm detector array and a maximum input spatial frequency of 20 cy/mm, both lenses can have
f_L = 4" ≈ 100 mm using a d_m = 2 µm minimum spot size recorder. This represents a
considerable reduction in system size.

FIGURE  7 

(A)  Computer  generated  hologram  and  (B)  output  plane  pattern  for  a  wedge  ring  detector 
computer  generated  hologram  with  concentric  output  detector  arrays 


The light budget for this system is excellent. We assume lens transmittances of 0.9, a
P1 transmittance of 0.5, a transmittance of 0.5/20 for each of the 20 wedge and ring elements
in P2 (where a 50% efficiency for a bleached or phase CGH is assumed), a P3 transmittance of
0.5 (each detector intercepts approximately 0.5 of the total light in the sinc function), and
10 different input spatial frequencies (and hence a division of the total light into ten
separate regions). The system's transmittance to one output detector is then
(0.1)(0.9)^2(0.5)^3(0.05) = 5 x 10^-4 = 0.05%. For a typical detector with 0.5 amp/watt
sensitivity and 0.3 nA dark current, an input light power of 6 x 10^-10 W corresponds to the
dark current and the maximum input light is 6 x 10^-4 W (for a 6 decade response detector).
If the photodiode is biased at 300 nA (much greater than the dark current), the minimum
detector power required is 0.6 µW and hence we require only 0.6 µW/(5 x 10^-4) = 1.2 mW of
input light. This is easily achieved by laser diode sources.
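
The light budget arithmetic above can be reproduced with the short sketch below. The
function and parameter names are hypothetical; the default values are the ones assumed in
the text.

```python
def system_transmittance(lens_t=0.9, n_lenses=2, p1_t=0.5, cgh_efficiency=0.5,
                         n_elements=20, detector_fraction=0.5, n_frequencies=10):
    """Fraction of the input light that reaches one output detector."""
    per_element = cgh_efficiency / n_elements      # 0.5/20 for each wedge or ring
    return (lens_t ** n_lenses) * p1_t * per_element * detector_fraction / n_frequencies

def required_input_power_w(signal_current_a, responsivity_a_per_w, transmittance):
    """Input optical power needed to produce a given detector current."""
    detector_power_w = signal_current_a / responsivity_a_per_w
    return detector_power_w / transmittance

if __name__ == "__main__":
    t = system_transmittance()
    print(f"transmittance = {t:.2e}")                                   # about 5e-4
    print(f"input power   = {required_input_power_w(300e-9, 0.5, t) * 1e3:.2f} mW")
```

Changing `n_elements` or `n_frequencies` shows directly how the required source power
scales with the number of sampling regions.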

A final topic is the converging nature of the light input to P2 in Figures 3 or 4. In a
converging beam FT system as is used, the FT is formed on a spherical surface, not in a plane
[10]. The displacement error to the plane where the CGH is placed is Δz = (x_2^2 + y_2^2)/f_L1
and is only a maximum of 0.25 mm. Thus, the FT spot size is only slightly larger
than the theoretical value and the pattern detected at P3 is correct. Phase curvature at P3
is of no concern since the size of the detectors is used (in our design). If (as occurs in
practice) only a small part of a wedge or ring is illuminated at P2, the Δz effect is of no
concern since all of the light still easily falls within a wedge or ring region.

We  have  concentrated  on  the  use  of  a  WRD  CGH  in  the  FT  plane.  However,  as  noted  earlier, 
a  WRD  can  also  be  used  in  the  autocorrelation  plane  to  produce  chord  distribution  functions. 
As  noted  earlier,  one  is  not  restricted  to  wedge  and  ring  shaped  detector  elements  but  can 
employ  other  detector  shapes  as  required.  This  is  attractive  both  for  FT  plane  sampling  and 
for  autocorrelation  plane  analysis.  The  use  of  CGHs  clearly  allows  considerable  flexibility 
in  the  detection  process  and  it  allows  separation  of  the  detector  shape  function  from  the 
detection  function,  thereby  allowing  more  optimized  components  to  be  used.  In  this  paper, 
the  general  concepts  of  the  CGH  detector  have  been  advanced,  general  calculations  and  design 
rules  have  been  advanced  and  laboratory  demonstrations  and  designs  of  two  different  WRD  CGHs 
have  been  provided. 
6. MULTIPLE FEATURE EXTRACTORS
AND  CLASSIFIERS:  AN  OPTICAL, 
WEDGE  RING  DETECTED  FOURIER 
TRANSFORM  SPACE  CASE  STUDY 


Feature  extractors  for  distortion-invariant  robot  vision 


David  Casasent 
Vinod  Sharma* 

Carnegie-Mellon University
Department  of  Electrical  and 
Computer  Engineering 
Pittsburgh,  Pennsylvania  15213 


Abstract.  Various  feature  extractors/classifiers  for  a  hierarchical  feature- 
space  pattern  recognition  system  are  described.  The  system  is  intended  to 
achieve  multiclass  distortion-invariant  object  identification.  Although  only  a 
Fourier  transform  feature  space  is  used,  our  basic  hierarchical  concepts,  our 
theoretical  analysis,  and  our  general  conclusions  are  applicable  to  other  feature 
spaces.  The  performance  using  intensity  and  phase  Fourier  transform  features 
and  the  performance  in  the  presence  of  noise  are  studied  and  quantified  for  two 
different  two-class  pattern  recognition  data  bases. 

Keywords: robot vision; dimensionality reduction; feature extraction; Fourier transform;
optical data processing; optical pattern recognition.
1. INTRODUCTION

Distortion-invariant multiclass pattern recognition is considered
using a Fourier transform (FT) feature space. Feature extraction,
dimensionality reduction, discrimination, and classification are
addressed. A simplified block diagram of our hierarchical pattern
recognition system is shown in Fig. 1. We begin with a Fourier
transform feature space, since such a representation is well known1 to
allow significant data compression. We extract the magnitude, phase,
or both from the Fourier transform plane. As the first
dimensionality-reduction technique, we use a wedge-ring detector (WRD) to
sample the Fourier transform plane data2,3 to reduce the
dimensionality of the feature space and retain only the dominant eigenvector
for each object class. This reduced subspace is calculated using a
Karhunen-Loeve (KL) transformation4 by new efficient techniques.5
This completes the dimensionality-reduction step in our system. To
provide discrimination, we employ two nonunitary transformations:
the Fukunaga-Koontz (FK)6 and the Foley-Sammon (FS)7
transformations. Our classifier then selects the best subspace from the KL,
FK, and FS feature vectors.

In Sec. 2, we review and highlight our two levels of dimensionality
reduction (WRD Fourier transform sampling and dominant
eigenimage calculation). We then discuss (Sec. 3) how we achieve
distortion invariance, and we detail the discrimination algorithms
used. Brief theoretical remarks on the use of Fourier transform plane
phase or magnitude features and on the noise performance of a
feature extractor then are advanced in Sec. 4. The two image data
bases used in our experiments and the results of our initial dominant
eigenimage feature vector calculations are summarized in Sec. 5.
More extensive distortion-invariant image test results are then
presented and discussed in Sec. 6. These results include a comparison of
the performance of our system for five different discrimination
vectors, a comparison of the performance of amplitude-only and
phase-only Fourier transform features, and a comparison of the classifiers
and feature extractors in the presence of noise. Our summary and
conclusions then are advanced in Sec. 7.


2.  DIMENSIONALITY  REDUCTION  AND  DISTORTION 
INVARIANCE 

If the input image or object is 256 x 256 pixels, its dimensionality is
n = 256^2. The discrete Fourier transform plane for such an object
still has a dimensionality of n. This is quite prohibitive for
subsequent feature extraction, matrix transformations, or other similar
operations. Thus, dimensionality-reduction techniques are essential
operations that must be applied to such a feature space. A Fourier
transform feature space is a most useful representation of structural,
resolution, and orientation information on the input object. Such a
feature space is also attractive since physical insight about the input
object is easily obtained from this feature space. Such a feature space
is well known1 to lend itself easily to dimensionality reduction. These
reasons, plus the ease with which such a feature space can be
produced optically (using a simple spherical lens) or digitally (by various
fast Fourier transform hardware and algorithms), make this an ideal
choice for our hierarchical feature-extraction studies.


Fig. 1. General Fourier transform (etc.) feature-extraction pattern recognition
system block diagram.


As the first level of dimensionality reduction, we sample the
Fourier transform plane with a WRD. If an optical system is used to
produce the Fourier transform, a commercial WRD device exists.¹
This unit consists of 32 wedge-shaped detector elements in one-half
of a circular detector and 32 annular-shaped detector elements in the
other half of the detector plane. This device thus provides 64 WRD
outputs and hence reduces the dimensionality of the Fourier transform
feature space from $n = 256^2$ to 64. One also can digitally
model such a device, of course. The ring detector elements provide
rotation invariance, whereas the wedge detector elements provide
scale invariance (if the values of the wedge-ring detector element
readings are properly normalized for object energy).²,³ To see this,
we first recall that the magnitude of the Fourier transform is
shift-invariant. Next, we note that as the scale of the input object
changes, the two-dimensional Fourier transform distribution changes
radially (inversely with the scale change of the input object). Thus,
the outputs of the wedge-shaped detector elements will remain invariant
to such input object scale changes. Finally, we recall that the
orientation of the two-dimensional Fourier pattern rotates as the input
object rotates. Thus, the ring-shaped Fourier plane sampling
elements have outputs that remain invariant to in-plane rotations of
the input object. These remarks follow for the case of a real and
positive input function, whose Fourier transform magnitude is symmetric
about the origin. This situation applies for the case of images, and
thus the two halves of the Fourier plane can be separately sampled as
described with no information loss.
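
The digital modeling of a WRD mentioned above can be sketched as follows. This is only an illustrative sketch, not the processor used in this work: the function name `wrd_features`, the choice of summing Fourier magnitude (rather than intensity) over each element, the exclusion of the dc sample and of frequencies beyond 0.5 cycle/pixel, and the normalization by the total output are all assumptions made for the example.

```python
import numpy as np

def wrd_features(image, n_wedges=32, n_rings=32):
    """Illustrative digital model of a wedge-ring detector (WRD).

    Wedges sample one half of the centered Fourier magnitude plane and
    rings sample the other half. Outputs are divided by their total as a
    simple (assumed) energy normalization.
    """
    ny, nx = image.shape
    magnitude = np.abs(np.fft.fftshift(np.fft.fft2(image)))

    # Spatial-frequency coordinates in cycles/pixel, dc at the array center.
    fy = np.fft.fftshift(np.fft.fftfreq(ny))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(nx))[None, :]
    radius = np.hypot(fx, fy)
    angle = np.arctan2(fy, fx)                     # values in [-pi, pi]

    inside = (radius > 0) & (radius < 0.5)         # drop dc and the corners
    wedge_half = inside & (angle >= 0) & (angle < np.pi)
    ring_half = inside & ~wedge_half

    # Element index of every frequency sample (only used under its mask).
    wedge_idx = np.minimum((angle / np.pi * n_wedges).astype(int), n_wedges - 1)
    ring_idx = np.minimum((radius / 0.5 * n_rings).astype(int), n_rings - 1)

    wedges = np.bincount(wedge_idx[wedge_half], weights=magnitude[wedge_half],
                         minlength=n_wedges)
    rings = np.bincount(ring_idx[ring_half], weights=magnitude[ring_half],
                        minlength=n_rings)
    features = np.concatenate([wedges, rings])
    return features / features.sum()
```

Calling `wrd_features(image)` on a 256 x 256 array returns a 64-element vector (32 wedge outputs followed by 32 ring outputs). Because only the Fourier magnitude is sampled, a cyclic shift of the input leaves the output unchanged.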

This WRD sampling, plus the training of our system on different
distorted images, provides a distortion-invariant pattern recognition
algorithm. In this and similar feature space approaches to pattern
recognition, one uses $N_1$ images in class 1 and $N_2$ images in class 2
to determine the parameters of the processor. These are referred to as a
training set of images. Each of the images per class is denoted by a
vector, with $\{x_i\}$ and $\{y_i\}$ being the sets of $i = 1, \ldots, N_1$
or $N_2$ training set image vectors for a two-class example. The
corresponding two-dimensional Fourier transforms are the n-dimensional
vector sets $\{x''_i\}$ and $\{y''_i\}$. These are dimensionality-reduced
to the WRD-sampled 64-dimensional vector sets $\{x'_i\}$ and $\{y'_i\}$.
As the second dimensionality-reduction step, we apply a KL
transformation⁴ to the autocorrelation matrix formed from the WRD
feature vectors for each separate object class. The autocorrelation
matrix is formed from the 64-element $x'_i$ vectors for each of the
training set images $\{x_i\}$ in class 1, and a second matrix is formed
from the corresponding $y'_i$ vectors of images in class 2. The
eigenvalues and eigenvectors of each matrix are calculated and
tabulated. One can then retain the dominant $\eta_x$ and $\eta_y$
eigenvectors per class, where $\eta_x$ and $\eta_y$ are typically
less than 4. In our experiments, we retained only the dominant
eigenvector for classes 1 and 2, which we denote by KL-1 and KL-2.
In practice, two or three eigenvectors would be used per class.
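
The second dimensionality-reduction step can be illustrated with a short sketch. It assumes the WRD vectors of one class are stacked as rows of an array; the function name `dominant_kl_eigenvectors` and the normalization of the eigenvalues to unit sum are assumptions made for the example, not details taken from the text.

```python
import numpy as np

def dominant_kl_eigenvectors(features, n_keep=1):
    """KL step for ONE class (illustrative sketch).

    features : (N_images, n_features) array of WRD vectors x'_i.
    Returns the eigenvalues (normalized to unit sum, largest first) and
    the n_keep dominant eigenvectors as columns.
    """
    n_images = features.shape[0]
    R = features.T @ features / n_images          # autocorrelation matrix
    eigvals, eigvecs = np.linalg.eigh(R)          # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]             # reorder: dominant first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvals / eigvals.sum(), eigvecs[:, :n_keep]
```

With five training images per class, at most five eigenvalues are nonzero, so `dominant_kl_eigenvectors(X, n_keep=1)` returns a single 64-element vector that plays the role of KL-1 (or KL-2 for the second class).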

3.  NONUNITARY  TRANSFORMATIONS 

To use these dominant eigenvectors defined in Sec. 2 for
classification, we compute $z'$ for an unknown input object vector $z$,
project it onto the eigenvectors KL-1 and KL-2 (for classes 1 and 2,
respectively), and select the class for the unknown input based upon
which projection value is larger. The KL, or dominant, eigenvector
transformation in Sec. 2 represents a considerable compression of data
and simplifies performing the nonunitary transformations discussed
below. The dominant eigenvectors represent each class well in an
optimal compressed manner. However, there is no assurance that
those features that represent each class well will be optimal for
discriminating one class from another. Thus, dominant eigenvectors
are useful for intraclass pattern recognition (that is, recognizing
different versions, i.e., geometrically distorted views of one object),
but not necessarily for interclass discrimination (distinguishing one
object class from another). In a hyperspace description of a feature
vector and a discriminant vector, unitary transformations do not
change the distances between vectors. To achieve discrimination or
interclass pattern recognition, linear nonunitary transformations
represent an attractive approach. These transformations can change
interclass distances and hence provide improved discrimination. We
pursued this approach rather than utilizing additional eigenvectors
per object class. This choice is logical since the use of more
eigenvectors would only further increase the dimensionality and
computational complexity of the processor. In the next two subsections,
we detail two nonunitary transformations that we have employed.

3.1.  Fukunaga-Koontz  transformation 

The first nonunitary transformation we consider is the
Fukunaga-Koontz (FK) transformation.⁵ To describe the steps in this
algorithm, we first define $P_i$ as the a priori probability for class i
and $R_i$ as the autocorrelation matrix for class i. We form the weighted
autocorrelation matrices $\hat{R}_1$ and $\hat{R}_2$ for each class, where
$\hat{R}_i = P_i R_i$, and we form the full autocorrelation matrix
$R = \hat{R}_1 + \hat{R}_2$. We then determine the transformation matrix
$T$ that diagonalizes $R$; i.e.,

$T R T^t = T(\hat{R}_1 + \hat{R}_2)T^t = I$ ,   (1)

where $I$ is the identity matrix. By this transformation we have
orthogonally decomposed the full $\hat{R}_1 + \hat{R}_2$ matrix. Next, we
apply $T$ to $\hat{R}_1$ and $\hat{R}_2$; i.e., we form new matrices for
each class given by $T\hat{R}_1T^t$ and $T\hat{R}_2T^t$.

These new correlation matrices have two attractive features:
(a) The eigenvectors $\psi_i^{(1)}$ and $\psi_i^{(2)}$ of $T\hat{R}_1T^t$
and $T\hat{R}_2T^t$ are the same. (b) The eigenvalues $\lambda_i^{(1)}$
and $\lambda_i^{(2)}$ associated with $\psi_i^{(1)}$ and $\psi_i^{(2)}$
are related by

$\lambda_i^{(1)} = 1 - \lambda_i^{(2)}$ .   (2)

From Eq. (2), we see that the dominant eigenvectors of the
transformed class 1 matrix are the least-dominant eigenvectors for the
transformed class 2 matrix. Thus, those eigenvectors that represent
class 1 the least represent class 2 the best (in the new FK transformed
feature space). Thus, this transformation has converted the input
data into a new space with a common set of basis functions (the
$\psi_i$). In this new space, the data in the two classes are now
separated. In our two-class problem, we will select the two $\psi_i$
with the largest $|\lambda_i^{(1)} - 0.5|$ values.

Since $R$ is formed from the KL vectors (Sec. 2) and since we only
retain one KL eigenvector per class, the rank of $R$ is two and there are
only two eigenvectors. We denote these two eigenvectors of the FK
transformed data by FK-1 and FK-2. FK-1 and FK-2 are the two
vectors that best discriminate class 1 objects from class 2 objects. To
use these new discriminant vectors to determine the class of an
unknown input image $z$, we form the WRD vector $z'$ and transform it
to a new vector $z'' = Tz'$. This transforms the input data to the new FK
space. We then project $z''$ onto an FK discriminant vector $\psi$ by
calculating $\psi^t z'' = d$. Depending upon whether $d$ is above or below
a threshold, we select class 1 or class 2 for the class of the input
object. We normalize the FK vectors and refer to the projections onto
the FK directions 1 and 2 (corresponding to FK-1 and FK-2). We note
that FK-1 and FK-2 do not refer to discriminant vectors for classes 1
and 2; rather, they refer to the two most dominant eigenvectors of the
transformed full autocorrelation matrix of both classes.
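
The steps of Eqs. (1) and (2) can be checked numerically with the following sketch. It is written under stated assumptions: the two classes are given as arrays of row vectors, equal a priori probabilities are the default, and the whitening matrix T is built from the eigen-decomposition of R with near-zero eigenvalues discarded. The function name `fukunaga_koontz` and the tolerance `rel_tol` are hypothetical choices for the example.

```python
import numpy as np

def fukunaga_koontz(X1, X2, p1=0.5, p2=0.5, rel_tol=1e-10):
    """Illustrative FK transformation for two classes of row vectors."""
    # Prior-weighted autocorrelation matrices of the two classes.
    R1 = p1 * X1.T @ X1 / X1.shape[0]
    R2 = p2 * X2.T @ X2 / X2.shape[0]

    # Whitening matrix T such that T (R1 + R2) T^t = I on the retained subspace.
    evals, evecs = np.linalg.eigh(R1 + R2)
    keep = evals > rel_tol * evals.max()          # discard the null space
    T = (evecs[:, keep] / np.sqrt(evals[keep])).T

    # Shared eigenvectors psi; lam1 are the class 1 eigenvalues, and the
    # class 2 eigenvalues are 1 - lam1 as in Eq. (2).
    lam1, psi = np.linalg.eigh(T @ R1 @ T.T)
    order = np.argsort(-np.abs(lam1 - 0.5))       # most discriminating first
    return T, lam1[order], psi[:, order], R1, R2
```

For an unknown WRD vector `z`, the projection onto the first FK direction would then be `psi[:, 0] @ (T @ z)`, which is compared with a threshold chosen from the training data.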
3.2. Foley-Sammon transformation

In the Foley-Sammon (FS) nonunitary transformation,⁶ we find a
linear discriminant vector $w$, selected to maximize the Fisher ratio⁷:

$F(w) = \frac{(\text{difference of means of projections})^2}{\text{sum of variances of projections}}$ .   (3)

In terms of the means $m_1$ and $m_2$ of the projections for class 1 and
class 2 training set objects onto $w$ and the scatter $s_1$ and $s_2$ of
these projections, we can write

$F(w) = \frac{|m_1 - m_2|^2}{s_1^2 + s_2^2} = \frac{w^T S_B w}{w^T S_W w}$ ,   (4)

where $S_B$ is the between-class scatter matrix and $S_W$ is the
within-class scatter matrix.⁷ The solution for $w$ that maximizes
Eq. (4) is

$w = S_W^{-1}(\mathbf{m}_1 - \mathbf{m}_2)$ ,   (5)

where $\mathbf{m}_1$ and $\mathbf{m}_2$ are the vector means of the two
classes. To use $w$ for an unknown input $z'$, we form $w^T z' = d$ and
compare the projection value to the threshold $T$, where

$T = \frac{m_1 + m_2}{2}$ .   (6)

If $d > T$, we select class 1. If $d < T$, we select class 2 for the
class of the unknown input image vector $z'$.

4. INTENSITY-ONLY OR PHASE-ONLY FOURIER TRANSFORM FEATURES

An attractive aspect of a Fourier transform feature space is the fact
that its magnitude or phase or both can be used. Considerable
work⁸⁻¹⁰ exists on the representation of image data by the intensity or
phase of the Fourier transform. In general, the conditions under
which the Fourier transform phase features are adequate are less
restrictive than the conditions under which the Fourier transform
magnitude features are adequate. The magnitude of the Fourier
transform is adequate if the z-transform does not contain reciprocal
pole-zero pairs, poles outside the unit circle, or zeros inside the unit
circle.

This prior work has been concerned with aesthetically pleasing
image reconstructions from the magnitude or phase of the Fourier
transform. However, our present concern is object recognition, not
image reconstruction. Little research exists on this topic. In our case
studies, we will wedge-ring detect and KL transform the Fourier
transform magnitude or phase data (or a combination of both). We
will then quantify the pattern recognition performance of magnitude
or phase features and their performance in the presence of noise. The
Fourier magnitude data are shift-invariant, and thus the location of
the object in the input field of view cannot be determined from such
data. Conversely, the linear components of the Fourier phase provide
data on the location of the input object. Digitally, the computational
complexities in extracting the magnitude or the phase of the Fourier
transform are more comparable. Optically, the Fourier magnitude is
easily obtained, whereas its phase requires the use of a more
complicated heterodyne detection technique.
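
The statement that the magnitude is shift-invariant while the linear phase carries the object location can be verified with a few lines of code. The sketch below uses a synthetic random image and a cyclic shift, both of which are assumptions chosen only to keep the example exact on a discrete grid.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 32
image = rng.random((n, n))
dy, dx = 3, 5                                   # cyclic shift in pixels
shifted = np.roll(image, shift=(dy, dx), axis=(0, 1))

F = np.fft.fft2(image)
G = np.fft.fft2(shifted)

# The magnitude is unchanged by the shift.
magnitude_unchanged = np.allclose(np.abs(F), np.abs(G))

# The shift appears only as a linear phase ramp across the Fourier plane.
u = np.fft.fftfreq(n)[:, None]                  # cycles/pixel along rows
v = np.fft.fftfreq(n)[None, :]                  # cycles/pixel along columns
ramp = np.exp(-2j * np.pi * (dy * u + dx * v))
phase_is_linear_ramp = np.allclose(G, F * ramp)
```

Both flags evaluate to `True`: the two spectra differ only by the phase ramp, whose slopes are set by the shift (dy, dx).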


5.  DATABASES 

The four image data bases used are summarized in Table I. They
include scaled and rotated images of the letters A and B and of
hand-drawn images of tanks and trucks. For each of these object
classes, we used a set of five images per class and a set of 25 images
per class. Various scaled and rotated views were included in each of
these image sets. A scale value of 1.0 is unity scale, and 0.9
corresponds to a 10% scale difference, etc. The specific distorted
object views included in each case are detailed in Table I. All images
have 16 gray levels, with the 1.0 nominally scaled images having
various numbers of pixels: A (584 pixels), B (375 pixels), tank (797
pixels), and truck (292 pixels). For our noise-free tests, these images
were present on a zero-valued background. For our noise tests,
zero-mean white Gaussian noise was added to all pixels in all images.
In our data, we list the standard deviation $\sigma_n$ of the noise.
From $\sigma_n$, the total number of pixels $N$ in the image, and the
object energy $E$ (the sum of the squares of the pixel values for the
object), an input signal-to-noise ratio $SNR_i = E/(N\sigma_n^2)$ can be
defined. For $N = 10^4$, $E = 10^4$ (400 pixels of average value 5), and
$\sigma_n = 0.4$, a small $SNR_i = 6.25$ results.
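
The numerical example above follows directly from the definition of the input signal-to-noise ratio. The short sketch below reproduces it and shows one way zero-mean white Gaussian noise might be added to an image array; the helper names `input_snr` and `add_white_noise` and the fixed random seed are hypothetical.

```python
import numpy as np

def input_snr(object_energy, n_pixels, sigma_n):
    """Input SNR defined as E / (N * sigma_n**2)."""
    return object_energy / (n_pixels * sigma_n ** 2)

def add_white_noise(image, sigma_n, seed=0):
    """Add zero-mean white Gaussian noise of standard deviation sigma_n."""
    rng = np.random.default_rng(seed)
    return image + rng.normal(0.0, sigma_n, size=image.shape)

# 400 object pixels of value 5 give E = 400 * 25 = 10**4.
energy = 400 * 5 ** 2
snr = input_snr(energy, n_pixels=10 ** 4, sigma_n=0.4)   # 10**4 / (10**4 * 0.16)
```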


TABLE I. Summary of Experimental Image Data Bases Used

                            5-Image data base                                  25-Image data base
Test sets                   Scales          Rotations                          Scales                    Rotations
A and B                     0.9, 1.0, 1.1   0°, 10° (for 0.9 and 1.1 scales)   0.8, 0.9, 1.0, 1.1, 1.2   ±10°, ±5°, 0° (for each scale)
Hand-drawn truck and tank   0.9, 1.0, 1.1   0°, 10° (for 0.9 and 1.1 scales)   0.8, 0.9, 1.0, 1.1, 1.2   ±10°, ±5°, 0° (for each scale)

In Table II, we list the five nonzero eigenvalues for the five-image
data base for all four object types and for both magnitude and phase
Fourier transform features. As seen, the eigenvalue for the dominant
eigenvector for magnitude Fourier transform features is approximately
70 times the second dominant (in general). This is more
pronounced for the letters A and B. The eigenvalue for the dominant
eigenvector for the letter A obtained from Fourier transform phase
data is low (0.67). Because of the lower (0.67) eigenvalue, we may
expect lower projection values and hence more errors in our pattern
recognition of letters using phase features. In general, the dominance
of one eigenimage in the magnitude data may be attributed to the fact
that the image data base consists of scaled and rotated (in-plane
rotation) images rather than different aspect views of each object. In
such distorted images, there is no appreciable new information present
in each object representation. A possible reason for the lower
dominant eigenvalues for phase features may be the reduced accuracy
associated with the nonlinear arctangent operation required to compute
the phase of the Fourier transform.

TABLE II. Eigenvalues of the Five Nonzero Eigenvectors of the WRD Fourier Transform Data for the Five-Image Data Base

      Truck                   Tank                    A                       B
No.   Magnitude   Phase       Magnitude   Phase       Magnitude   Phase       Magnitude   Phase
1     0.98        0.99        0.98        0.89        0.99        0.67        0.99        0.95
2     0.17×10⁻¹   0.78×10⁻²   0.17×10⁻¹   0.98×10⁻¹   0.71×10⁻²   0.24        0.13×10⁻¹   0.43×10⁻¹
3     0.82×10⁻⁴   0.28×10⁻³   0.21×10⁻³   0.12×10⁻¹   0.84×10⁻⁴   0.72×10⁻¹   0.49×10⁻³   0.19×10⁻²
4     0.81×10⁻⁶   0.12×10⁻³   0.64×10⁻⁴   0.24×10⁻²   0.47×10⁻⁴   0.11×10⁻¹   0.29×10⁻⁴   0.77×10⁻³
5     0.49×10⁻⁶   0.11×10⁻⁴   0.11×10⁻⁵   0.14×10⁻²   0.65×10⁻⁵   0.38×10⁻²   0.17×10⁻⁴   0.70×10⁻⁴

Fig. 2. Magnitude-only WRD Fourier transform features projected onto
dominant tank/truck eigenvectors (for 25-image data base). KL-1 =
dominant truck eigenvector; KL-2 = dominant tank eigenvector.

The eigenvalue data for the 25-image data base showed comparable
results to those in Table II. As noted in Sec. 2, we retained only
the dominant eigenvector per class for magnitude-only and phase-only
data. For magnitude features, we expect the second dominant
eigenvector to provide poor discrimination (this was found to be the
case from experiments). For the tank and truck data, phase features
may be expected to perform comparably and possibly better than
magnitude features. In experiments, phase features (for the tank and
truck data) using one dominant eigenvector per class consistently
gave larger projection ratios than magnitude features. For the letters
A and B, phase features performed poorly (as expected, since the
dominant eigenvalue is smaller). Including the second dominant
eigenvector for the phase features for our letter recognition tests
would be expected to improve performance. However, we included
only the most dominant eigenimage per class. Our extensive test
results obtained with the 25-image data set are detailed in Sec. 6.
They follow the trends noted above, which are expected from the
data in Table II.


6.  INITIAL  EXPERIMENTAL  RESULTS 
6.1.  KL  transformations 


All of the results included in this section were obtained on our more
extensive data base of 25 object images per class. In Fig. 2, we show
the scatter plots for the projections of all tank and truck images onto
the dominant eigenvector for trucks (KL-1) and for tanks (KL-2). As
seen, all images can be separated and correctly classified from either
projection alone. However, all projection values (even those on the
dominant eigenvector of the other class) are quite large (all
projection values are above 0.95). This might be expected since the KL
eigenvectors are useful only for intraclass recognition, not interclass
discrimination. Figure 2 shows that the projections of the truck
images on the dominant truck eigenvector KL-1 yield essentially
invariant values (≈0.993). The tank images projected onto the tank
eigenvector KL-2 show a similar invariance with all projection values
≈0.995. This intraclass invariance is expected (because of the
dominance of the first eigenimage in each class) by the nature of the KL
transform. From Fig. 2, we can also assess the interclass
discrimination of dominant KL eigenvectors. The tank projections on KL-1
yield lower (0.97 to 0.99) projections (versus 0.995 for projection on
KL-2). The truck images show similarly lower projections on KL-2
compared to KL-1. However, the fact that all projections are large and
closely spaced (all are above 0.95) makes the performance of this
system in noise suspect. These results thus verify the intraclass
recognition ability of the KL transform. If the two classes are
sufficiently different, interclass discrimination will be good, but the
KL algorithm does not guarantee this. For the case shown, either
eigenvector alone is sufficient for discrimination. However, this is
not a general conclusion.

Fig. 3. Magnitude-only WRD Fourier transform feature projections for
tank/truck images onto the two FK vectors (25 images/class).

An interesting trend from the data of Fig. 2 is that only five points
exist for the 25 truck images. These correspond to the five different
input image scales (denoted by the values of the parameter a, as
shown) with all five rotated views per scale giving the same
projection. This occurs since rotated images at the same scale have the
same energy, whereas different scaled images have different energy.
With a different normalization of the image data base, the projection
values for different object scales could be made to coincide. This
effect is most pronounced for the truck images since they all contain
significantly fewer pixels than any scaled tank image used and KL-2 is
normalized for the tank images alone.

Similar results were obtained for the projections of the images of
the letters A and B onto their dominant eigenvectors (for
magnitude-only Fourier data). These results did not exhibit as
pronounced a variation with the scale of the input image (since both
letters contain a comparable number of pixels). The data still
exhibited the five clusters of projection values (one cluster per scale,
with only small variations due to rotation) for the reasons advanced
above. All projection values for the letters were quite large and even
more clustered than in the tank data (all letter projections were above
0.998). More advanced techniques are clearly warranted, and thus we
next experimentally considered our nonunitary transformations.


6.2.  Nonunitary  transformations 

The projections of the truck-tank data base images on the FK-1 and
FK-2 feature vectors are shown in Fig. 3. Comparing these results
with the corresponding projection data on the KL-1 and KL-2
eigenimages (Fig. 2), we see that the FK-1 and FK-2 feature vectors
separate these two image classes much more than do the KL-1 and
KL-2 eigenimages. This verifies our remark that the FK feature
vector direction that represents one class best represents the second
class worst, and that FK transformations are preferable for
discrimination, whereas KL or dominant eigenvector projections provide
only intraclass recognition. Now we notice a variation in the
projection value along both the FK-1 and FK-2 axes (in Fig. 2, only
variations in the projections on the dominant KL eigenvector for the
opposite class were observed). Variations in both axes occur here
since each FK feature vector is a linear combination of the KL-1 and
KL-2 eigenvectors. Figure 4 shows the projection values obtained by
projecting the truck-tank image data base onto the FS feature vector.
The projection values now appear to be separated more than those
for the KL projections, but less than for the FK projections. A
quantitative performance measure for comparing these different
feature extractors is now advanced.

TABLE III. Comparison of Separability Measure S for Magnitude and Phase Features for Different Case Studies and Different Feature Extractors

FT data       Magnitude only          Phase only              Magnitude and phase
Images        Truck-Tank  A-B         Truck-Tank  A-B         Truck-Tank  A-B
S for KL-1    4.130       7.087       5.681       1.326       7.147       0.254
S for KL-2    2.898       5.984       4.596       8.419       8.761       5.535
S for FK-1    3.908       12.135      5.450       0.201       6.285       0.226
S for FK-2    3.879       12.131      4.253       9.371       7.765       12.880
S for FS      4.504       11.898      7.578       9.428       8.541       12.620

Fig. 4. Magnitude-only WRD Fourier transform feature projections for
tank/truck images on the best Foley-Sammon (FS) vector (25
images/class). For clarity of presentation, the projection values are
shown displaced from the FS axis.


6.3.  Performance  measure 


From Fig. 3 (compared to Fig. 2), the difference in the expected
values of the projections of the two classes of data onto the FK
feature vectors is larger than for the projections onto the dominant
KL eigenvectors per class. However, the variance is also larger in the
FK projection case. The same general conclusions also hold for the
A-B image recognition data. The scatter plots in Figs. 2 to 4 are
useful for visually conveying results. However, they can be misleading
since they bias one to favor a feature extractor that yields larger
differences in the mean values for the projections of different data
classes.

To more properly compare different feature extractors, the actual
projection values (note the different scales in Figs. 2 to 4) and the
variances of the projection values within each class must both be
considered. To achieve this and to quantify the performance of our
various feature extractors, we use the separation measure

$S = \frac{\text{difference of means of projections per class}}{\text{average standard deviation per class}}$ .   (7)

The denominator in Eq. (7) is $(\sigma_1 + \sigma_2)/2$, where
$\sigma_1$ and $\sigma_2$ are the standard deviations of the class 1 and
class 2 projections. This performance measure in Eq. (7) is valid if
$\sigma_1$ and $\sigma_2$ are of the same order. For our data, this was
found to be generally true. The measure S in Eq. (7) was chosen for its
computational ease and because it does not match the measure that any
of our feature extractors optimizes. We computed S for all five feature
extractor vector subsets (KL-1, KL-2, FK-1, FK-2, and FS) for both
magnitude and phase Fourier transform data (and combined magnitude and
phase Fourier transform data) for both image data bases (vehicles and
letters). The results are shown in Table III and discussed below.
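
The separation measure of Eq. (7) is simple to evaluate once the projection values of each class are available. The sketch below computes it for projections onto a Fisher-type discriminant of the form of Eq. (5), with the midpoint threshold of Eq. (6). The synthetic Gaussian data, the function names, and the use of a pseudoinverse for the within-class scatter matrix are assumptions made for illustration; they are not the data or the numerical procedure behind Table III.

```python
import numpy as np

def fisher_vector(X1, X2):
    """w = S_W^{-1} (m1 - m2) for two (N_i, d) arrays of feature vectors."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    S_w = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
    # Assumption of this sketch: the pseudoinverse is used because S_W is
    # singular whenever d exceeds N1 + N2 - 2.
    return np.linalg.pinv(S_w) @ (m1 - m2)

def separation_measure(proj1, proj2):
    """Eq. (7): difference of class means over the average class std."""
    return abs(proj1.mean() - proj2.mean()) / (0.5 * (proj1.std() + proj2.std()))

rng = np.random.default_rng(0)
offset = np.array([10.0, 0.0, 0.0, 0.0])      # synthetic, well-separated classes
class1 = rng.normal(size=(50, 4)) + offset
class2 = rng.normal(size=(50, 4))

w = fisher_vector(class1, class2)
d1, d2 = class1 @ w, class2 @ w               # projections d = w^T z'
threshold = 0.5 * (d1.mean() + d2.mean())     # midpoint threshold, Eq. (6)
S = separation_measure(d1, d2)
```

An input is assigned to class 1 when its projection exceeds `threshold` and to class 2 otherwise. The same `separation_measure` function applies unchanged to projections onto KL or FK vectors, which is what makes S usable as a common yardstick.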


6.4.  Noise-free  performance  comparison 

A larger S value in Table III indicates better performance. This table
includes the S performance measures calculated using magnitude-only,
phase-only, and combined magnitude and phase Fourier transform
features.

Let us now discuss Table III. The phase features often give larger S
values than do magnitude features (consistently so for the truck-tank
pair). However, in several cases they perform much worse (for the
letters A and B). This occurs because the dominant KL eigenvalue for A
is small (as in Table II) for these images. Consistent performance
improvement with phase features is expected (and such features appear
preferable) if more than one dominant phase eigenvector is retained.
Ignoring the phase feature data for the letters, consistent trends
emerge from Table III. Different results occur for different image pair
recognitions. However, FS consistently performs best (or nearly so),
with FK always being quite close and, surprisingly, KL-1 being
consistently good. If two vector subsets were to be chosen for a given
problem, those with the largest S value in one column would be
selected. Combined phase and magnitude features generally perform
better than either alone, but the increased complexity in using both
magnitude and phase features often yields only a small improvement.

Thus, from such noise-free tests, Fourier transform magnitude
data appear to perform well. (Optically, Fourier transform magnitude
data are calculated much more easily and hence are preferable if
the performance obtained is adequate.) But phase data are preferable
(if their dominant eigenvalue is sufficiently large). If two Fourier
transform phase eigenvectors are retained, and if Fourier transform
phase data can easily be calculated, phase features are preferable. If
the object classes being discriminated are sufficiently different, KL is
adequate. However, in general, FK or FS is recommended. Clearly,
the results are data-dependent. Thus, let us consider the performance
of all feature extractors in the presence of noise before advancing a
final decision.


6.5.  Noise  performance  comparisons 

To best assess the performance of our five feature extractors, we
consider our two case studies (vehicles and letters) separately. In
Table IV, we list the calculated S value for the vehicle identification
tests for both magnitude-only and phase-only Fourier transform
data as a function of the standard deviation $\sigma_n$ of the noise
added to the input data. In Table V, similar data for our letter
identification case study are provided. In these tables we also include
the magnitude of the eigenvalue for the dominant eigenvector for the
class 1 and class 2 data (the reason for this will be apparent shortly).

TABLE IV. Eigenvalues and Separability Measure S for the Truck and Tank Images for Different Noise Standard Deviations and for Magnitude-Only and Phase-Only Fourier Transform Data

                                 Truck and tank (magnitude data)         Truck and tank (phase data)
Noise standard deviation         0.0     0.1     0.2     0.3     0.4     0.0     0.1     0.2     0.3     0.4
Dominant eigenvalue of class 1   0.995   0.995   0.994   0.993   0.992   0.881   0.457   0.238   0.234   0.243
Dominant eigenvalue of class 2   0.999   0.999   0.999   0.999   0.999   0.617   0.599   0.558   0.504   0.450
S for KL-1                       4.215   4.341   4.382   4.315   4.145   6.169   4.513   2.338   0.588   0.416
S for KL-2                       2.893   2.924   2.919   2.874   2.792   4.760   4.385   4.017   3.669   3.083
S for FK-1                       3.923   3.955   3.952   3.915   3.847   5.624   3.787   1.668   0.436   0.429
S for FK-2                       3.894   3.924   3.919   3.878   3.806   4.046   3.786   3.963   3.537   3.110
S for FS                         4.625   4.705   4.746   4.744   4.699   7.689   5.592   3.995   3.598   3.229

TABLE V. Eigenvalues and Separability Measure S for the Letter A and B Images for Different Noise Standard Deviations and for Magnitude-Only and Phase-Only Fourier Transform Data

                                 A and B (magnitude data)                A and B (phase data)
Noise standard deviation         0.0     0.1     0.2     0.3     0.4     0.0     0.1     0.2
Dominant eigenvalue of class 1   0.999   0.999   0.998   0.996   0.994   0.582   0.350   0.198
Dominant eigenvalue of class 2   0.999   0.999   0.998   0.998   0.997   0.836   0.600   0.392
S for KL-1                       8.515   9.321   7.748   4.580   2.744   0.744   0.382   0.198
S for KL-2                       6.918   6.652   5.049   3.341   2.441   8.480   6.397   3.451
S for FK-1                       13.487  16.453  12.344  6.193   3.766   0.259   0.235   0.072
S for FK-2                       13.475  16.402  12.284  6.172   3.753   8.547   6.143   3.548
S for FS                         13.320  17.057  13.949  6.752   4.010   8.555   6.800   3.480

In Table IV, we focus attention on the S performance values
obtained as $\sigma_n$ increases. Reading the performance measure data
horizontally, we see a negligible change in S with $\sigma_n$ for
magnitude-only data. Similarly, the maximum eigenvalues $\lambda$ for
both image classes also vary only slightly with $\sigma_n$. Using the
phase-only Fourier transform features, we find a quite significant
decrease in S as $\sigma_n$ increases. This shows that the performance S
for phase features degrades quite significantly as the noise in the
data is increased. In this case, $\lambda_{max}$ is also reduced
significantly with increasing $\sigma_n$ and thus reflects the trend
noted above.

In Table V, similar data are shown for our letter recognition case
study. The magnitude feature data show a decrease in S as $\sigma_n$
increases. However, the decrease in S for the phase features is even
more appreciable. Thus, from both Table IV and Table V we find
that phase features are a less robust feature set than magnitude
features in the presence of noise.

Let us now consider the reasons for the observed performance in
Tables IV and V. We first note that we expect the Fourier transform
magnitude data to be concentrated in several dominant spatial
frequencies, whereas the Fourier transform phase data are expected to
be more uniformly distributed over the Fourier transform plane.
This is logical and is the basis for the success of dimensionality
reduction using WRD Fourier transform plane sampling. Thus, with
Fourier transform magnitude features, a few wedge or ring detector
elements dominate object identification. Conversely, with Fourier
transform phase features, all wedge or ring detector elements
contribute more equally. Thus, when a given amount of noise is present
in the input image, it is evenly distributed over all wedge and ring
Fourier transform samples (for white noise). For Fourier transform
magnitude features, the actual noise contribution in the important
Fourier transform plane wedge and ring elements is proportionally
much less than for Fourier transform phase features. Hence, we might
expect (as observed) poorer noise performance using phase features
rather than magnitude Fourier transform features. The computational
accuracy associated with evaluating the function from which the
nonlinear phase features are obtained may be a secondary factor in this
observed noise performance for phase features.


7.  SUMMARY  AND  CONCLUSIONS 

The classic Fourier transform plane has been considered as a feature space for distortion-invariant recognition. Wedge- and ring-sampled Fourier transform plane features were employed to reduce the dimensionality of the feature space and to provide scale and rotational insensitivity in our feature extractor. New feature extraction algorithms were applied to these WRD samples of the Fourier transform plane, and the importance of magnitude and phase Fourier transform data for pattern recognition applications was considered. The performance of our pattern recognition system for two different two-class image data bases (vehicles and letters) was quantified for all feature extractors, for phase and magnitude Fourier transform features, and in the presence of noise.

The feature extractors considered were the Karhunen-Loève dominant eigenvectors for each class, the Fukunaga-Koontz transformed discriminant vectors, and the Foley-Sammon discriminant vector. Extensions of all cases to more than two-class pattern recognition applications follow directly. For the cases considered, the KL-I vector performed well, but the FK vectors were generally better, and the FS vector was almost always the best. This follows from the fact that the KL technique provides only intraclass recognition, whereas FK and FS techniques provide interclass discrimination. Our study of the use of magnitude or phase Fourier transform features showed that phase features were sometimes better, but that in general the dominance of one eigenvector for phase data was harder to achieve and thus, if such features were used, more eigenvectors must be retained. This, plus the ease with which magnitude Fourier transform features can be optically computed, makes such a feature space preferable. This use of Fourier transform plane magnitude and phase data for pattern recognition differs considerably from its more conventional use in image reconstruction. Lastly, we considered the noise performance and robustness of Fourier transform magnitude and phase features and found magnitude features to be far preferable. An initial heuristic but theoretical basis for this result that appears to be quite plausible was advanced.

OPTICAL ENGINEERING / September/October 1984 / Vol. 23 No. 5 / 497

CASASENT, SHARMA

Chord Distributions in Pattern Recognition:
Abstract


The use of chord distributions in pattern recognition is discussed and efficient ways to compute such distributions are noted. New methods to achieve scale and in-plane rotational distortion-invariant multi-class recognition and estimates of the distortion parameters are described. 3-D out-of-plane rotational distortion-invariant methods are reviewed.


1. Introduction


Chord distributions are well-known features that describe the shape of an object and that are useful for object identification [1-3]. These features can easily be computed (optically or digitally) from the autocorrelation. In Section 2, we define the chord distribution and discuss different chord pdfs. These include an observation space h(ξx, ξy) and a feature space h(r) and h(θ). New insight is provided into the local and global features produced by chord pdfs and the use of silhouette and boundary (profile) imagery. In Section 3, attractive properties of these chord distributions for scale and in-plane rotation invariance are discussed. A new use of such features for distortion-invariant multi-class object recognition and methods to extract the object's scale and orientation are advanced.

In Section 4, methods to achieve 3-D object distortion-invariance (to out-of-plane rotations) are reviewed. The resultant feature extractor thus enables multi-class object classification in the presence of a wide variety of geometrical distortions.


2. Chord Features and Distributions


2.1 Definition. The conventional chord distribution h(r,θ) is a plot of the distribution of the lengths (r) and directions (θ) of all chords drawn between all pairs of points on the boundary of the object f(x,y). The two chord pdfs of most use are h(r) and h(θ), the pdfs of chord lengths r and directions θ. To most easily compute the various chord distributions, one can begin by forming the autocorrelation


$$b(x,y) \circledast b(x,y) = \iint b(x,y)\, b(x+\xi_x,\, y+\xi_y)\, dx\, dy = R(\xi_x,\xi_y) = h(\xi_x,\xi_y) \tag{1}$$


of the boundary b(x,y) of an object. The autocorrelation describes the number of points of intersection for a given horizontal and vertical shift (ξx, ξy) between two shifted images of the object. The value of R at a given (ξx, ξy) thus precisely gives the number of chords with given horizontal and vertical projection lengths (ξx, ξy) [3-4].

To show this, we write (ξx, ξy) = (r cosθ, r sinθ), where r = (ξx² + ξy²)^(1/2) is the radial chord length and θ = tan⁻¹(ξy/ξx) is the chord's angular orientation. Substituting into (1), we see that R(ξx, ξy) contains information from which h(r,θ) can be obtained.

From h(ξx, ξy), the chord distribution h(r,θ) can be calculated. The chord pdfs h(r) and h(θ) are more useful and are most easily calculated from h(ξx, ξy) by appropriately sampling the autocorrelation function. If the autocorrelation is sampled radially, we obtain


$$h(r) = \int h(\xi_x,\xi_y)\, r\, d\theta \tag{2}$$


If  we  sample  it  angularly,  we  obtain 


$$h(\theta) = \int h(\xi_x,\xi_y)\, dr \tag{3}$$
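As a rough numerical illustration of (1) to (3), the sketch below forms the autocorrelation of a small binary boundary image by direct summation and then accumulates one half of the autocorrelation plane into ring and wedge bins. The function names (`autocorrelation`, `chord_histograms`), the bin counts and the equal-width binning are assumptions made for this example; it simply counts chords per bin and is not a model of the optical wedge-ring detector. It needs at least two object points.

```python
import math

def autocorrelation(img):
    """Direct (non-FFT) autocorrelation of a small 2-D list image.

    R[(sx, sy)] = sum over x, y of img[y][x] * img[y + sy][x + sx].
    For a binary boundary image this counts the point pairs (chords)
    whose horizontal and vertical projections are (sx, sy).
    """
    rows, cols = len(img), len(img[0])
    R = {}
    for sy in range(-(rows - 1), rows):
        for sx in range(-(cols - 1), cols):
            total = 0
            for y in range(max(0, -sy), min(rows, rows - sy)):
                for x in range(max(0, -sx), min(cols, cols - sx)):
                    total += img[y][x] * img[y + sy][x + sx]
            if total:
                R[(sx, sy)] = total
    return R

def chord_histograms(R, n_rings=4, n_wedges=4):
    """Bin one half of the autocorrelation plane into rings and wedges."""
    # Keep one half-plane only; the other half is its mirror image,
    # so every chord is counted exactly once. The zero shift is excluded.
    half = {s: v for s, v in R.items() if s[1] > 0 or (s[1] == 0 and s[0] > 0)}
    r_max = max(math.hypot(sx, sy) for sx, sy in half)
    h_r = [0] * n_rings
    h_theta = [0] * n_wedges
    for (sx, sy), count in half.items():
        r = math.hypot(sx, sy)
        theta = math.atan2(sy, sx)  # lies in [0, pi) on this half-plane
        ring = min(int(n_rings * r / r_max), n_rings - 1)
        wedge = min(int(n_wedges * theta / math.pi), n_wedges - 1)
        h_r[ring] += count
        h_theta[wedge] += count
    return h_r, h_theta

# Four corner points of a square: 4 points give 4 * 3 / 2 = 6 chords.
img = [[1, 0, 1],
       [0, 0, 0],
       [1, 0, 1]]
R = autocorrelation(img)
h_r, h_theta = chord_histograms(R)
```

Both histograms sum to the number of distinct point pairs, which is a convenient sanity check when experimenting with other shapes.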


2.2 Realization. These h(r) and h(θ) chord pdfs are the features we will use. To obtain (2) and (3) optically, we form h(ξx, ξy) optically (typically from the Fourier transform of the power spectrum of the object) and sample this distribution using wedge- and ring-shaped detector elements [4]. Such a detector unit exists (Figure 1) with 32 wedges in one half of a circular plane and 32 rings in the other half of the plane [5]. The autocorrelation function is symmetric and thus no loss of information results by sampling only half of the autocorrelation plane. In terms of chord distributions, the symmetry of the autocorrelation function arises because each chord in the image is counted twice as one traverses the boundary of the object. In one case, one end point of the chord is encountered first, and in the other case the other end point is encountered first. The first corresponds to a chord with projections (ξx, ξy) and a length r. The symmetric case corresponds to a chord with projections (−ξx, −ξy) and the opposite direction. For similar reasons of symmetry, the orientation of the wedge and ring halves of the detector does not matter. The wedge outputs provide h(θ) (quantized to 32 θ values over 180°) and the ring outputs provide h(r) (quantized to 32 r values over the radius of the autocorrelation function). Figure 2 shows the general block diagram of our chord distribution feature generator using a wedge-ring detector (WRD).


[Figure 1. Simplified representation of a wedge-ring detector (WRD).]

[Figure 2. Simplified block diagram of a chord distribution pattern recognition system: INPUT OBJECT → AUTOCORRELATION → WRD SAMPLING → h(r), h(θ) → FEATURE EXTRACTION (FISHER) AND CLASSIFICATION.]


2.3 Boundary, Silhouette and Gray-Level Objects. Different chord distributions result depending on the type of input object. For a boundary or edge image (case A), the distribution produced is of the number of edge or boundary pixels (i.e., the number of chords). This is the conventional chord distribution. For a silhouette image (binary with all ones on the object and with zeroes on the background), the distribution produced (case B) is the same as case A, but weighted by the common area of overlap of the two images for the given (ξx, ξy) shift. If the shift is large, corresponding to long chords, the weighting will be small. However, if the shift is small, corresponding to short chords, the weighting will be large. Thus, this weighted chord distribution that results for the case of a silhouette object (case B) emphasizes short chords more than long chords. The chord distribution in case A will be more susceptible to noise in the interior of the object (internal pixels of value 1 result in many new chords being produced in case A, whereas in case B zero internal pixels cause a loss of chords but a much lower percent change results than in case A). When the chord distribution in case A is computed from the autocorrelation or power spectrum (as in Sections 2.1 and 2.2), it is much simpler to calculate than by other methods, which have great difficulty when applied to a non-continuous boundary. However, each missing boundary pixel in case A will still result in a loss in the number of chords counted.


The weighted chord distribution (case B) emphasizes short chords. These correspond to local object features (whereas long chords correspond to global object features). Since local object features are useful for discrimination between object classes (inter-class), we expect the weighted chord distributions to provide superior object discrimination. Long chords, corresponding to global object features, are more useful for intra-class object recognition (within one object class, in the face of various object distortions). The performance of weighted chord distribution features in the presence of noise in the input is expected to be superior to the use of conventional chord features. In a boundary image (case A) with N pixels on the boundary, each noise pixel on the object produces N new chords and each missing boundary pixel (due to noise) causes N chords to be removed from the distribution. With N² total chords, each noise pixel thus changes the total h by a factor 1/N. In case B, each weighting function is on the order of N² (this is more true for short chords than long chords) and thus each noise pixel produces a change in h by a factor of only 1/N² (this is a considerable improvement, since N is usually quite large). For the same reason that the change in h for short chords is less susceptible to noise, it will also be less susceptible to small differences in the object's shape (due to distortions), but changes due to sufficiently different objects are still retained.
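To put rough numbers on this argument, take a hypothetical boundary of N = 100 pixels (a value chosen only for illustration). Using the order-of-magnitude counts stated above, the relative change in h caused by one noise pixel is approximately:

```latex
\text{Case A: } \frac{\Delta h}{h} \approx \frac{N}{N^2} = \frac{1}{N} = \frac{1}{100} = 1\%
\qquad
\text{Case B: } \frac{\Delta h}{h} \approx \frac{1}{N^2} = \frac{1}{10^4} = 0.01\%
```

Under these assumptions a single noise pixel perturbs the weighted (silhouette) distribution about N = 100 times less than the conventional (boundary) one.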

The dynamic range of the chord features in cases A and B appears to be comparable. Since use of the boundary image (case A) whitens the image's spectrum and results in a sharper correlation function compared to the broader correlation pattern that results in case B, wedge-ring detection in case B is much simpler. Case B is clearly preferable from noise considerations: its inter-class discrimination is clearly enhanced and its intra-class recognition should be retained. Since all chords are available (and more easily detectable in case B), one can use the preferable chord features (short or long, local or global) for a given problem.
If the gray-levels of the object and its internal structure are reliable, then the chord distribution for the gray-level image (case C) is most useful. The distribution in case B is one level of a general chord distribution. The distribution in case C is a higher level of generalized chord distribution [4]. In this case, the chord distribution for all internal chords or internal object points is provided. Algorithms such as (1) with the boundary object b(x,y) replaced by the full object f(x,y) provide such features with no increase in computational load for optical systems (digital systems can achieve simplified correlations when operating on binary imagery).

3.  Scale  and  Rotation-Invariant  Chord  Processor 

3.1 Insight. The chord pdf h(r) is invariant to in-plane rotation of the object. This is obvious since the in-plane rotation of an object does not alter its radial distribution. The chord pdf h(θ) simply shifts with in-plane rotations. This follows directly since h1(ξx, ξy) = h(r cosθ, r sinθ) changes to h2(ξx, ξy) = h[r cos(θ + θ0), r sin(θ + θ0)] for rotation of the input object by θ0, i.e., in (r,θ) space, h2(r,θ) = h1(r, θ + θ0). Thus in-plane object rotations rotate h(ξx, ξy) and translate h(θ). The chord pdf h(θ) is invariant to scale distortions of the object whereas h(r) scales (rather than shifts) with an input scale change a. The invariance of h(θ) with scale is obvious. For a scale change a in the input object, the h(r) distribution scales proportional to a and h(ar) is obtained.

As long as half of the correlation plane is sampled in θ and r, the above remarks remain valid [due to the symmetry of the autocorrelation and due to the cyclic shift nature of h(θ)]. Table 1 summarizes these properties.


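Table 1. Properties of the h(r) and h(θ) distributions

PARAMETER    | Distribution Property, h(r) | Distribution Property, h(θ) | Amplitude Effects
Rotation, θ0 | Invariant                   | Shifts                      | [entry illegible in source]
Scale, a     | Scales r → ar               | Invariant                   | [entry illegible in source]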


Table 1 also notes the effects on the amplitudes of the h(r) and h(θ) features. We now detail the origin of these variations. We consider first the effect of a scale change (by a factor of a) in the input object on the amplitudes of h(r) and h(θ). First, we consider the observation space h(ξx, ξy). The image f(x,y) with scale a = 1 produces h1. This relates to h2 for a ≠ 1 as detailed below. From (1),

$$h_1(\xi_x,\xi_y) = \iint f(x,y)\, f(x+\xi_x,\, y+\xi_y)\, dx\, dy \tag{4}$$


For the scaled object (scale factor a),

$$h_2(\xi_x,\xi_y) = \iint f(ax, ay)\, f[a(x+\xi_x),\, a(y+\xi_y)]\, dx\, dy \tag{5}$$

Changing variables (u,v) = (ax, ay), we obtain

$$h_2(\xi_x,\xi_y) = \frac{1}{a^2}\iint f(u,v)\, f(u + a\xi_x,\, v + a\xi_y)\, du\, dv = \frac{1}{a^2}\, h_1(a\xi_x,\, a\xi_y) \tag{6}$$

From (6), we see that h2 is a scaled version of h1, with the amplitudes scaled by (1/a²).
Now we consider the effect of scale changes on the h(r) and h(θ) distributions. For the h(r) distribution, we find, from (2),

$$h_1(r) = \int h_1(r\cos\theta,\, r\sin\theta)\, r\, d\theta \tag{7}$$

where h1(r cosθ, r sinθ) is the observation-space distribution of (4) written in polar coordinates. For a scaled object (scale factor a), using (6),

$$h_2(r) = \frac{1}{a^2}\int h_1(ar\cos\theta,\, ar\sin\theta)\, r\, d\theta = \frac{1}{a^3}\int h_1(ar\cos\theta,\, ar\sin\theta)\, ar\, d\theta = \frac{1}{a^3}\, h_1(ar) \tag{8}$$

Thus, from (8), we find a scale change (by a) between the h1(r) and h2(r) distributions and an amplitude scale factor (1/a³). For h(θ), the effect of a scale change a is simply

$$h_2(\theta) = \frac{1}{a^3}\, h_1(\theta), \tag{9}$$

i.e., only a (1/a³) amplitude factor.
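The step leading to (9) is not written out in the text. A short derivation, assuming h(θ) is obtained from (3) with (ξx, ξy) = (r cosθ, r sinθ), applying the observation-space scaling result (6), and substituting ρ = ar, would read:

```latex
\begin{aligned}
h_2(\theta) &= \int h_2(r\cos\theta,\, r\sin\theta)\, dr \\
            &= \frac{1}{a^2}\int h_1(ar\cos\theta,\, ar\sin\theta)\, dr \\
            &= \frac{1}{a^3}\int h_1(\rho\cos\theta,\, \rho\sin\theta)\, d\rho
             = \frac{1}{a^3}\, h_1(\theta)
\end{aligned}
```

The factor 1/a² comes from the two-dimensional change of variables in (6), and the remaining 1/a from dr = dρ/a.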

The distribution and amplitude effects of θ0 and a distortions summarized in Table 1 and detailed above are valid for continuous data and continuous r and θ sampling. Finite r and θ sampling is expected to change the exact results somewhat. Specifically, due to sampling, an exact ratio of a³ is not expected. Furthermore, the scale change from h(r) to h(ar) can be quite difficult to uncover since the distribution for one scale may lie in 11 rings and the distribution for another scale can easily lie in 6 or 8 rings. Thus, the h(θ) distribution is the most useful one for general (a plus θ0) distortions. The h(r) scale r changes linearly (to ar) and is thus not a simple shift. When the effect of a finite number of r samples is included, the h(r) effect with a is nonlinear. If we scale the h(r) distribution in r by a, the ratio a³ then exists between the h(r) for a scaled object and the original h(r) scaled in r by a. Thus, the distribution and amplitude effects of scale are coupled as just detailed. Specifically, this means that the amplitude ratio is a³, but it is this for different r and ar points in the distribution (not the same r points).

By g(x,y) = f(ax, ay), we describe both the position and value of the pixels. Specifically, new pixel (x,y) is old pixel (ax, ay) (i.e., a > 1 corresponds to a scale decrease) and the values of the old and new pixels are the same. Our above formulae for amplitude effects proportional to a⁻³ thus apply for binary silhouette images (analogous formulae for gray-scale images can be derived and used if the input data is gray-scale. In such cases, with a < 1, we have a larger image with more pixels and more intensity per pixel, since the object is closer and received intensity is inversely proportional to range squared). For binary silhouette images and a < 1, the new image is larger. Thus, for a given (ξx, ξy) shift, we obtain more overlap, larger correlation values, more weighting and more chords. Our new h2 will have larger amplitudes (more chords) than h1 and this agrees with h2 = a⁻²h1 > h1 predicted.

3.2 Distortion-Invariant (a and θ0) Pattern Recognition. The insight provided in Section 3.1 and the distortion effects summarized in Table 1 are most useful in devising a new pattern recognition feature extractor (invariant to scale a and in-plane rotation θ0 distortions). We consider 3 distortion cases separately below and summarize our results in Table 2. From Table 1, we note that the h(θ) distribution is the most useful one in general (since it provides invariance to scale automatically and to rotations if shifted versions of h(θ) are tested; and since the ratio of h(θ) and a reference hR(θ) provides an estimate of a, whereas the best shift of h(θ) provides an estimate of θ0). For only-scale distortions, h(θ) is best, and for only-rotation distortions, h(r) is best for classification (since these features are invariant to the indicated distortions).

3.2.1 In-Plane Rotations. For the case when θ0 is the only distortion present, we compare the h(r) distribution with hR(r) for all references R. This provides an estimate of the object class R. Next, for the best reference R (obtained from the h(r) and hR(r) comparisons), we compare h(θ) and hR(θ) for various shifts θ0 in hR(θ). From the hR(θ + θ0) and h(θ) comparisons, we obtain a verification of our initial class estimate R and an estimate of θ0. A combination of both h(r) and h(θ) tests thus provides the best class R estimates.
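The two-step rotation-only procedure can be sketched as follows. The text does not say how two distributions are compared, so the sum of squared differences used here is an assumption, as are the function names, the reference names "A" and "B", and the toy feature values. The shift convention follows h(θ) = hR(θ + θ0), with each wedge bin spanning 180°/n.

```python
def squared_error(u, v):
    """Sum of squared differences between two equal-length feature lists."""
    return sum((p - q) ** 2 for p, q in zip(u, v))

def best_cyclic_shift(h, h_ref):
    """Find the shift k minimizing the error between h[i] and h_ref[(i + k) % n]."""
    n = len(h)
    best_k, best_err = 0, None
    for k in range(n):
        shifted = [h_ref[(i + k) % n] for i in range(n)]
        err = squared_error(h, shifted)
        if best_err is None or err < best_err:
            best_k, best_err = k, err
    return best_k, best_err

def classify_rotation_only(h_r, h_theta, references):
    """references maps a class name to its (h_r, h_theta) reference pair."""
    # Step 1: class estimate from the rotation-invariant h(r).
    cls = min(references, key=lambda name: squared_error(h_r, references[name][0]))
    # Step 2: rotation estimate (and a check on the class) from shifts of h(theta).
    k, err = best_cyclic_shift(h_theta, references[cls][1])
    theta0_deg = k * 180.0 / len(h_theta)
    return cls, theta0_deg, err

# Hypothetical references with 4 rings and 6 wedges (30 degrees per wedge).
references = {
    "A": ([5, 3, 1, 0], [9, 1, 2, 3, 4, 0]),
    "B": ([1, 2, 6, 0], [3, 3, 3, 3, 3, 3]),
}
# Test input: class A with its h(theta) cyclically shifted by two wedges.
h_r_in = [5, 3, 1, 0]
h_theta_in = [2, 3, 4, 0, 9, 1]
cls, theta0_deg, err = classify_rotation_only(h_r_in, h_theta_in, references)
```

A small residual error at the best shift supports the class estimate from h(r); a large one suggests the class estimate should be revisited.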

3.2.2 Scale Changes. For the case of an a distortion alone, we compare h(θ) for the test input vs. hR(θ) for all references R. We must compare h(θ)/hR(θ) for each θ. The reference R for which this ratio is constant for all θ provides the class estimate R. The ratio h(θ)/hR(θ) provides an estimate of a also. To confirm our R and a estimates, we form h(r) and hR(ar) for the initial R and a estimates. Agreement of h(r) and hR(ar) confirms our initial estimates. Combining both the h(θ) and h(r) tests again yields better estimates.

3.2.3 Combined Scale (a) and Rotation (θ0) Distortions. When both a and θ0 distortions are present (the most general case), analysis relies on h(θ) and is more complex. We form h(θ)/hR(θ + θ0) for all R and all shifts θ0. When this ratio is constant for all θ, the corresponding R, a and θ0 estimates are obtained. The ratio provides the a estimate.
Table 2. Scale a and In-Plane Rotation θ0 Invariant Multi-Class Pattern Recognition

Distortion                  | Procedure                                              |                                                      | Results
(A) Rotation θ0 only        | Compare h(r) and hR(r)                                 | h(r) is rotation invariant                           | Class R estimate
                            | Compare hR(θ + θ0) and h(θ)                            | h(θ) shifts with θ0                                  | Confirms R estimate; provides θ0 estimate
(B) Scale a only            | Compare h(θ)/hR(θ) for each θ                          | Constant ratio provides R; ratio provides a estimate | Class R and scale a estimates
                            | Compare h(r)/hR(ar)                                    | Confirms above estimate                              | Confirms R and a estimates
(C) Rotation θ0 and scale a | Compare h(θ)/hR(θ + θ0) for all R and all shifts θ0    | Constant ratio provides R and θ0; ratio gives a      | Initial estimates of R, θ0, a
                            | Compare h(r)/hR(ar)                                    | Confirm above estimates                              | Confirm R and a estimates


As a check, we form h(r)/hR(ar) for the initial R and a estimates. From the constancy of the ratio, we verify our R and a estimates. Forming h(r)/hR(ar) initially for all a is more computationally intensive and thus the order chosen appears best. This is also the most general case.
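The combined search over references and shifts can be sketched as below. The text only asks that the ratio h(θ)/hR(θ + θ0) be constant over θ; measuring constancy by the standard deviation of the ratios divided by their mean is an assumption, as are the function names and the toy values. The scale follows from (9): a constant ratio of 1/a³ gives a = ratio^(−1/3). Reference bins are assumed strictly positive so that the ratios are defined.

```python
import math

def ratio_spread(h, h_ref, shift):
    """Relative spread and mean of the ratios h[i] / h_ref[(i + shift) % n]."""
    n = len(h)
    ratios = [h[i] / h_ref[(i + shift) % n] for i in range(n)]
    mean = sum(ratios) / n
    std = math.sqrt(sum((r - mean) ** 2 for r in ratios) / n)
    return std / mean, mean

def estimate_class_rotation_scale(h_theta, references):
    """Search all references and cyclic shifts for the most constant ratio."""
    best = None
    for name, h_ref in references.items():
        for k in range(len(h_theta)):
            spread, mean_ratio = ratio_spread(h_theta, h_ref, k)
            if best is None or spread < best[0]:
                best = (spread, name, k, mean_ratio)
    spread, name, k, mean_ratio = best
    theta0_deg = k * 180.0 / len(h_theta)
    a = mean_ratio ** (-1.0 / 3.0)  # from h2(theta) = h1(theta) / a**3
    return name, theta0_deg, a, spread

# Hypothetical references with 4 wedges (45 degrees per wedge).
references = {
    "A": [8.0, 16.0, 24.0, 40.0],
    "B": [10.0, 10.0, 30.0, 20.0],
}
# Test input: class A shifted by one wedge and reduced in amplitude by 1/8,
# which corresponds to a = 2 under relation (9).
h_theta_in = [2.0, 3.0, 5.0, 1.0]
name, theta0_deg, a, spread = estimate_class_rotation_scale(h_theta_in, references)
```

With sampled data the ratio is never exactly constant, so in practice the smallest spread would be compared against a tolerance before the estimates are accepted and then checked against h(r)/hR(ar).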

4.  Out-Of-Plane  Distortions 

For a and θ0 distortions, we require one h(r) and h(θ) distribution per class R for our training set. To accommodate out-of-plane distortions, we use several training set images per object class and from all h(r) and h(θ) features select those with the largest Fisher ratio F (from training set data). We then form a linear discriminant function w that maximizes F for a multi-class feature set. An input test feature vector c (chord distribution) is projected onto w and the projection value determines the input object class. This algorithm [4] has demonstrated perfect performance in selected image distortion tests.
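The projection step can be illustrated with a deliberately reduced case: two classes and two features, where the Fisher vector has the closed form w = Sw⁻¹(m1 − m2). The multi-class discriminant and feature selection described above are not reproduced here; the function names, the midpoint decision threshold and the sample values are all assumptions for illustration.

```python
def mean_vec(samples):
    """Component-wise mean of a list of 2-D feature vectors."""
    n = len(samples)
    return [sum(s[j] for s in samples) / n for j in range(2)]

def scatter(samples, m):
    """2x2 scatter matrix: sum over samples of (x - m)(x - m)^T."""
    S = [[0.0, 0.0], [0.0, 0.0]]
    for s in samples:
        d = [s[0] - m[0], s[1] - m[1]]
        for i in range(2):
            for j in range(2):
                S[i][j] += d[i] * d[j]
    return S

def fisher_vector(class1, class2):
    """Two-class Fisher discriminant w = Sw^-1 (m1 - m2) for 2-D features."""
    m1, m2 = mean_vec(class1), mean_vec(class2)
    S1, S2 = scatter(class1, m1), scatter(class2, m2)
    Sw = [[S1[i][j] + S2[i][j] for j in range(2)] for i in range(2)]
    det = Sw[0][0] * Sw[1][1] - Sw[0][1] * Sw[1][0]
    inv = [[Sw[1][1] / det, -Sw[0][1] / det],
           [-Sw[1][0] / det, Sw[0][0] / det]]
    d = [m1[0] - m2[0], m1[1] - m2[1]]
    w = [inv[0][0] * d[0] + inv[0][1] * d[1],
         inv[1][0] * d[0] + inv[1][1] * d[1]]
    return w, m1, m2

def classify(x, w, m1, m2):
    """Project x onto w and compare with the midpoint of the projected means."""
    def project(v):
        return w[0] * v[0] + w[1] * v[1]
    threshold = 0.5 * (project(m1) + project(m2))
    return 1 if project(x) > threshold else 2

# Hypothetical training features for two object classes.
class1 = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
class2 = [(6.0, 5.0), (7.0, 8.0), (8.0, 7.0), (7.0, 6.0)]
w, m1, m2 = fisher_vector(class1, class2)
label = classify((2.0, 2.0), w, m1, m2)
```

At run time only the inner product with w is needed, which is why this kind of classifier is inexpensive once training is done.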

5. Summary

Chord distributions h(r) and h(θ) have been shown to be easily computed from the autocorrelation of the input object and WRD (radial and angular) sampling. Using the various properties (Table 1) of h(r) and h(θ), a new multi-class pattern recognition system for scale and in-plane rotational distortions was advanced (Table 2). Combined with our prior out-of-plane rotational distortion work (Section 4), this feature space can provide full 3-D object distortion invariance and estimates of the distortion parameters (orientation and scale) of the object.
ABSTRACT

A two-level feature extraction classifier using a geometrical-moment feature space is described for multi-class distortion-invariant pattern recognition. The first-level classifier provides object class and aspect estimates using multi-class Fisher projections and optimized two-class Fisher projections in a hierarchical classifier. Aspect estimates are provided from ratios of the computed moments. The second-level classifier provides the final class estimate, distortion parameter estimates and the confidence of the estimates. Extensive test results on a ship image database are presented.

1. INTRODUCTION

One can efficiently compute the moments of an input object by various methods [1,2]. These features are excellent descriptions of the geometrical aspects of an object. They are quite unique since they can provide information on the orientation, scale and location of the input object [2] and because they can be corrected for various system computing errors [3]. In this paper, our earlier moment classifier [2] is modified to include a two-level classifier (Section 2). This provides significantly improved performance. We earlier [4] described initial results for robotic object parts. Here, we detail the new two-level classifier design (Section 2), and the performance obtained (Section 4) for an extensive ship image database (Section 3).


2. NEW MOMENT-BASED CLASSIFIER

2.1 Moment Statistics

The geometrical moments

$$m_{pq} = \iint f(x,y)\, x^p y^q\, dx\, dy \tag{1}$$


of an input object f(x,y) are jointly-Gaussian random variables (JGRV) [6] due to the finite spatial sampling of the input image and they are good estimates of the actual moments of an input object. This JGRV model allows us to use a conventional Bayesian classifier [5] that minimizes the probability of incorrect class estimates (Section 2.4). The mean μi and covariance Σi for each object class i must be estimated to use this classifier. Generally, this requires a training set of imagery. Because the moment features are JGRVs, we require only one object view per class to achieve such estimates. Thus, such a classifier using these geometrical moment features does not require a large training set of data.

2.2  Aspect  Angle  Estimator  (First-Level  Classifier) 

The moment features are JGRVs only with respect to scale (a,b), translation (x0,y0) and in-plane rotations, but not for out-of-plane rotations. Thus, we must estimate the out-of-plane rotation (aspect) for the input object. This is achieved in our first-level classifier, which thus includes each image aspect view as a separate object class. We thus distinguish object classes (in our present database tests, Section 3, this refers to different ship classes) from view classes (these include all aspect views of all ship images). In our first-level classifier, we estimate the aspect angle of the input object from the ratio A = μ20/μ02 of the central moments, where μ20 = m20 − m10²/m00 and μ02 = m02 − m01²/m00. For all reference objects in the class being tested, we calculate A and then form the ratio K of the input and reference A values. The aspect view with the K value closest to unity is selected plus all aspect views with K ≤ TA (the aspect threshold). In our tests, we use TA = 1.5. Those aspect views of the class being tested with K ≤ TA are passed to the second-level classifier.
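A minimal sketch of this aspect screening is given below, using discrete sums in place of the moment integrals. Several choices are assumptions because the source is not fully legible on them: K is taken as the input A divided by the reference A, the rule K ≤ TA is applied literally (so views with K well below unity also pass), and the view names and reference values are hypothetical.

```python
def raw_moment(img, p, q):
    """Discrete geometrical moment m_pq = sum of f(x, y) * x**p * y**q."""
    return sum(val * (x ** p) * (y ** q)
               for y, row in enumerate(img)
               for x, val in enumerate(row))

def aspect_ratio(img):
    """A = mu20 / mu02 from the central second-order moments."""
    m00 = raw_moment(img, 0, 0)
    m10 = raw_moment(img, 1, 0)
    m01 = raw_moment(img, 0, 1)
    mu20 = raw_moment(img, 2, 0) - m10 ** 2 / m00
    mu02 = raw_moment(img, 0, 2) - m01 ** 2 / m00
    return mu20 / mu02

def select_aspect_views(a_input, reference_ratios, t_a=1.5):
    """Return the view with K closest to 1 and all views with K <= t_a.

    reference_ratios maps a view name to the A value of that reference view.
    K = a_input / a_ref is an assumed ordering of the ratio.
    """
    k = {view: a_input / a_ref for view, a_ref in reference_ratios.items()}
    closest = min(k, key=lambda view: abs(k[view] - 1.0))
    passed = {view for view in k if k[view] <= t_a}
    passed.add(closest)
    return closest, sorted(passed)

# A 2-row by 4-column silhouette: mu20 = 10, mu02 = 2, so A = 5.
img = [[1, 1, 1, 1],
       [1, 1, 1, 1]]
a_in = aspect_ratio(img)
# Hypothetical reference A values for three aspect views of one ship class.
refs = {"bow": 1.0, "quarter": 4.0, "broadside": 6.0}
closest, passed = select_aspect_views(a_in, refs)
```

Because A depends only on second-order central moments, it is unaffected by translation, which is why it can be computed before any position estimate is available.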

2.3  Object  Class  Estimator  (First-Level  Classifier) 

To further reduce the number of view classes (aspect plus class) passed to the second-level classifier, we use multi-class and two-class Fisher projections [7] on a training set of ship images. From the scatter plots for the multi-class Fisher projections, we select the two subsets of object classes that are best separated at each node in a tree classifier. For each node, we then calculate (from training set images) the two-class Fisher vector that best separates and clusters the two subsets at each node. For example, for node 0, the full set of N−1 multi-class Fisher vectors F1 to F(N−1) for the N object classes are computed. From examination of the projections of the inputs onto the two most dominant Fisher vectors, we select the two subsets (with possibly several object classes per subset) to be separated at node 0. The two-class Fisher vector for these two subsets is then calculated and the projections of all training set data on this Fisher vector are plotted. From this plot, weighted distances to the two class means are calculated and a class estimation threshold TCi is selected. If the weighted distance for the projection of an input test image exceeds TCi, then we proceed down the corresponding branch at that node of the hierarchical tree. On each branch, another node is present at which the classes on that branch are further divided into two smaller subsets. New multi-class Fisher projections are used at each node to determine the two subsets to use and a new two-class Fisher projection vector is calculated for use at each node.

All of these calculations are performed off-line on a limited number of training set
images. To account for scale and translational distortions in the input image, the central
moments normalized for scale are used in the first-level classifier and the scatter plots
are calculated for different aspect views of each class. Details and examples of this
organized first-level class estimator are provided in Section 3. This hierarchical procedure
is followed until terminal nodes are reached and a decision on the class estimate(s)
of the input object is made. For some objects [4], full separation into all classes is not
possible. If the calculated weighted-distance measure for the input test image is less
than $T_{c1}$, all classes at that node are passed to the next level. Use of alternate
nodes is included to allow better separation of subsets at certain nodes for particular
databases. The real-time calculations involved in this hierarchical class estimator are
quite simple. The test feature vector is simply projected onto several discriminant vectors
(each such operation is merely a vector inner product) and from the projected values, class
estimate(s) are obtained. For each such class estimate, the aspect class estimator (Section
2.2) is used to determine the total number of view classes to be processed in our second-level
classifier.
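
The on-line tree search described above can be sketched as follows. This is only an illustration under stated assumptions: the `Node` structure, the weighting of distances by the projected standard deviation of each subset, and the confidence measure |1 - d_near/d_far| are hypothetical choices modeled on the threshold definition given later in the text; the toy tree values are invented.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Node:
    w: np.ndarray      # two-class Fisher vector stored for this node
    mu: tuple          # projected means of subsets 0 and 1
    sigma: tuple       # projected standard deviations (assumed distance weights)
    children: tuple    # each child is another Node or a list of class labels

def estimate_classes(node, m, t_c1=0.35):
    """Return the class estimate(s) for feature vector m."""
    if isinstance(node, list):                 # terminal node: class labels
        return node
    p = float(node.w @ m)                      # one vector inner product per node
    d = [abs(p - node.mu[k]) / node.sigma[k] for k in (0, 1)]
    near = 0 if d[0] <= d[1] else 1
    far = 1 - near
    measure = abs(1.0 - d[near] / d[far]) if d[far] > 0 else 0.0
    # Confident: follow the nearer branch. Otherwise pass all classes onward.
    branches = [near] if measure > t_c1 else [0, 1]
    found = []
    for k in branches:
        found += estimate_classes(node.children[k], m, t_c1)
    return found

# Toy tree: node 0 splits class 2 from {0, 1}; node 1 splits class 0 from class 1.
node1 = Node(np.array([0.0, 1.0]), (-1.0, 1.0), (0.5, 0.5), ([0], [1]))
node0 = Node(np.array([1.0, 0.0]), (-2.0, 2.0), (1.0, 1.0), (node1, [2]))
clear_case = estimate_classes(node0, np.array([2.1, 0.0]))
ambiguous_case = estimate_classes(node0, np.array([-2.0, 0.1]))
```

In the ambiguous case the measure at node 1 falls below the threshold, so both classes at that node are returned for the second-level classifier to resolve.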


2.4  Bayesian  Classifier  (Second-Level  Estimator) 

Because the operations required in the Bayesian classifier are computationally more intense,
the first-level estimator is used to reduce the number of view classes to be processed
in the second-level estimator. The conventional Bayesian classifier minimizes
the probability of an incorrect class i estimate (here i denotes a view class). Using the
assumptions of JGRV features, the discriminant function to be minimized is [5]

$$g_i(x) = (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i), \qquad (2)$$

where $\mu_i$ and $\Sigma_i = \Sigma$ are the mean vector and covariance matrix for class i. For our present
case, the feature vector x is a moment vector m and thus only one object view per class is
needed to measure $\mu_i$ and $\Sigma_i$. Operation of such a classifier thus proceeds by calculating
$g_i(x)$ for the measured input feature x = m for all object classes i. The class i that minimizes
$g_i(x)$ is the best class estimate in a Bayesian sense. The discriminant function in
(2) is the Mahalanobis distance. If $\Sigma = I$, it becomes the Euclidean distance measure or a
nearest-neighbor classifier. Use of $\Sigma = I$ assumes that all moments are independent and that
the expected variations of all moments are equal.
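
A minimal numerical sketch of the discriminant in (2) is given below. It assumes a single covariance matrix shared by all classes, as in the text; the function name `classify` and the two-dimensional example values are hypothetical.

```python
import numpy as np

def classify(x, means, cov_inv):
    """Evaluate g_i(x) = (x - mu_i)^T Sigma^-1 (x - mu_i) for every class i."""
    diffs = means - x                                   # one row per class
    g = np.einsum("ij,jk,ik->i", diffs, cov_inv, diffs)
    return int(np.argmin(g)), g                         # best class and all g_i

means = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])  # hypothetical class means
x = np.array([1.6, 0.2])

# Sigma = I reduces (2) to the squared Euclidean distance (nearest neighbor).
best_euclid, g_euclid = classify(x, means, np.eye(2))

# A non-identity covariance weights each feature by its expected variation.
cov = np.array([[4.0, 0.0], [0.0, 0.25]])
best_mahal, g_mahal = classify(x, means, np.linalg.inv(cov))
```

With the identity weighting the three values of g are 2.60, 2.00 and 10.40, so class 1 is selected; the second call shows how the same routine accepts any inverse covariance estimate.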


To utilize (2), i must be a view class. To calculate all object class distortion parameters,
i.e., scale (a, b), range (R), translation ($x_0$, $y_0$) and in-plane rotation ($\theta$) as well
as aspect view angle ($\phi$), we let the view class i include the object class and aspect view
angle and we include the other parameters in a distortion parameter vector
b = ($x_0$, $y_0$, a, b, R, $\theta$). We combine the view class and distortion parameters as $m_i(b)$ and thus evaluate (2) for
all view classes i and all distortion parameter vectors b. Since $m_i(b)$ is a nonlinear function
of b, we use an iterative algorithm of the form


$$b^{k+1} = b^k + a^k r^k, \qquad (3)$$


where $b^k$ is the b estimate at iteration k and $b^{k+1}$ is a point in an r-dimensional space at
a distance $a^k$ in the direction $r^k$ from the present estimate $b^k$. To determine the complete
form for the iterative algorithm in (3), we expand $m_i(b)$ in a Taylor series about
the present $b^k$ point as

$$m_i(b) = m_i(b^k) + J^k (b - b^k), \qquad (4)$$

where $J^k$ is the Jacobian of $m_i(b)$ with respect to b at the k-th iteration. For a measured
input feature vector m, the error to be minimized is $e_i = m - m_i(b)$ and the square-error
measure is $E_i = e_i^T \Sigma^{-1} e_i$, where $\Sigma^{-1}$ is the weighting matrix used. Substituting $e_i$ and (4)
into the expression for $E_i$, the b that minimizes $E_i(b)$ is found to satisfy

$$b^{k+1} = b^k + [(J^k)^T \Sigma^{-1} J^k]^{-1} (J^k)^T \Sigma^{-1} [m - m_i(b^k)]. \qquad (5)$$


Eq. (5) is the nonlinear iterative algorithm used in our second-level classifier to estimate
b. Thus, for each view class i, (5) is repeated and new b estimates are obtained. For each
$b^k$, we calculate the normalized difference

$$\Delta g_i = [g^k(i,b) - g^{k-1}(i,b)] / g^k(i,b) \qquad (6)$$

between two successive $g_i$ estimates, where $g_i(b) = E_i$. The iterations in the Gauss-Newton or
Newton algorithm in (2) and (5) are continued until $\Delta g_i$ is less than a convergence threshold
T.
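
The update in (5) with the stopping rule in (6) can be sketched numerically as follows. This is an illustration only: the forward-difference Jacobian, the function names, and the five-element `toy_model` standing in for the moment model m_i(b) are assumptions made here, not the moment model of the report.

```python
import numpy as np

def gauss_newton(m, model, b0, sigma_inv, tol=1e-2, max_iter=20, eps=1e-6):
    """Iterate Eq. (5) until the normalized change in g, Eq. (6), is below tol."""
    b = np.asarray(b0, dtype=float)
    g_prev = None
    for k in range(1, max_iter + 1):
        base = model(b)
        # Forward-difference Jacobian J^k (an assumption; analytic forms also work).
        J = np.column_stack([(model(b + eps * e) - base) / eps
                             for e in np.eye(b.size)])
        step = np.linalg.solve(J.T @ sigma_inv @ J, J.T @ sigma_inv @ (m - base))
        b = b + step
        err = m - model(b)
        g = float(err @ sigma_inv @ err)      # g = E_i, the weighted square error
        if g_prev is not None and abs(g - g_prev) < tol * g:
            break
        g_prev = g
    return b, g, k

def toy_model(b):
    # Hypothetical nonlinear feature model standing in for m_i(b).
    return np.array([b[0], b[1], b[0] * b[1], b[0] ** 2, b[1] ** 2])

# Measured vector: the model at b = (1.5, -0.5) plus a small fixed perturbation.
m_meas = toy_model(np.array([1.5, -0.5])) + np.array([0.01, -0.01, 0.005, 0.0, 0.01])
b_hat, g_hat, n_iter = gauss_newton(m_meas, toy_model, [1.0, 0.0], np.eye(5))
```

The final value `g_hat` is the quantity that would be compared across view classes i, and `tol` corresponds to the convergence threshold T.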


2.5  Parameters  and  Overview 

The full moment-based two-level estimator is shown in block diagram form in Figure 1. It consists
of an optical moment feature computer, first-level class and aspect estimators, and
the second-level Bayesian classifier. The outputs from the two first-level estimators are
used to access those reference moment vectors necessary for the second-level nonlinear iterative
classifier. The final outputs are the class estimates i (class and aspect angle $\phi$),
the target's distortion parameters or orientation information b and the confidence $g_i$ of the
estimates.

FIGURE 1

Block  Diagram  of  a  Two-Level  Moment-Based  Classifier 


To facilitate b estimates, $x_0$ and $y_0$ are estimated from $-m_{10}/m_{00}$ and $-m_{01}/m_{00}$ and scale
is estimated from $m_{00}$. To facilitate calculations, J is evaluated with ($x_0$, $y_0$, a, b) =
(0, 0, 1, 1), i.e. assuming that the presently calculated distortions b are correct and thus
viewing future iterations as updates on the present $b^k$ rather than the initial $b^0$ estimates.
These and other features of the iterative algorithm allow it to converge in typically fewer
than 15 iterations. Different approximations to $\Sigma^{-1}$ were considered in our case study.
Such measures were essential since $\Sigma$ is ill-conditioned. Approximations considered were:
$\Sigma^{-1} = I$ and $\Sigma^{-1} = W W^T$, where W is the multi-class Fisher projection matrix of the reference vector
set. The iterative convergence threshold T is typically chosen as 0.01. This corresponds to a 1% difference in
successive iterates as in (6). The class estimation threshold $T_{c1} = |1 - d_1/d_2|$ is chosen as
0.35, where $d_1$ and $d_2$ are the distances of the projection to the two weighted class boundaries
at each node in our first-level class estimator. The class estimation threshold $T_{c2}$
for the second-level Bayesian classifier is defined similarly and is chosen to be 0.35 also.


3. SHIP DATABASE


3.1  Image  Sets 

Ships on the open sea represent an attractive application for feature-space techniques
(since one object can often easily be included in the field-of-view). The class, orientation
and range of the object in this application are necessary for missile guidance and target
selection. The set of ship imagery available consisted of five ship models with 36 different
aspect views per ship class available from a 90° depression angle (0° attack angle) at 10°
intervals (a total of 180 view classes). Figure 2 shows the broadside views of the five ship
classes.

FIGURE 2

Broadside  Views  of  the  Five  Ship  Classes 


TABLE 1

Ship Image Database, 36 Images Per Class

CLASS NUMBER   SHIP NAME   SHIP TYPE
0              Moskva      Soviet Helicopter Cruiser
1              Leahy       U.S. Guided-Missile Cruiser
2              Hope        Hospital Ship
3              Albany      U.S. Guided-Missile Cruiser
4              Brooke      U.S. Guided-Missile Frigate

Table 1 lists the names and types of each general ship class. For each ship, the original
images were binarized and data sets with and without the hull removed were prepared. All
data included here were obtained with the hull present. Each image was 128 x 32 pixels with
approximately 2000 pixels on the broadside views and fewer than 200 pixels on the bow and stern
views. Several other ship image databases used are noted in Table 2. These include the
standard reference images used in the second-level classifier (only four
images, all in the first quadrant), the broadside images only, and other selected object views.

TABLE 2

Miscellaneous Image Training and Test Sets Used

DATA SET                    SPECIFIC SHIP IMAGES    SYMBOL
Standard Reference Images   10°, 30°, 50°, 80°      S
Broadside Images            40°-140°, 220°-320°     B
Even Views                  0°, 20°, etc.           E
Odd Views                   10°, 30°, etc.          O
All Views                   0°, 10°, 20°, etc.      A

3.2  Hierarchical  Tree 

In Figure 3, we show the scatter plot for all views of all five ship classes on the two
dominant multi-class Fisher vectors. As seen, ship class two is the most easily separated.
Thus, at node 0 we chose to separate the class two ship (the Hope) from the others. This
yields a terminal node for one branch from node 0. At node 1, we examined a similar scatter
plot for classes 0, 1, 3 and 4 and chose to separate the class 0 ship (the Moskva) from the
three U.S. guided-missile ships. At node 2, we then separated these three ships (the Brooke, a
frigate, from the two cruisers) and finally at node 3 we separated the two U.S. cruisers.
Figure 4 shows the final hierarchical tree used for our ship image database.

FIGURE 3

Sample  Multi-Class  Fisher  Feature  Scatter 
Plot  of  all  Ship  Images  (node  0) 


FIGURE  4 

Ship  Node  Tree  Constructed  from  Multi-Class 
Fisher  Projections  and  Scatter  Analyses 


3.3  Node  Threshold  Selection 

In Figure 5, we show the projection of the subsets at node 2 in the tree of Figure 4 (a 1
denotes a class four projection and a 0 denotes class one and three projections). The discrimination
point D is the point where the weighted distances to the means ($\mu_0$ and $\mu_1$) of
the two subsets are equal. The lower bounds $D_0$ and $D_1$ for each subset are noted. For less
uniform clusters, $D_0$ and $D_1$ are selected at several standard deviations from $\mu_0$ and $\mu_1$. The
weighted distances $D'_0$ and $D'_1$ (normalized to 1.0) from $D_0$ to $\mu_0$ and $D_1$ to $\mu_1$ respectively
were calculated. The D' values for all nodes were found to lie in the range from 0.35 to
0.45. Thus, $T_{c1}$ = 0.35 was selected. If more noise is expected in the input data, $T_{c1}$ can
be lowered. However, if the wrong class estimate is passed from the level-one classifier,
this will be quite detrimental to performance. Thus, the use of a lower $T_{c1}$ threshold
should be carefully considered. In subsequent tests, we verified that the same hierarchical
tree structure of Figure 4 would be chosen from a significantly reduced set of 16 reference
images (specifically 4 images in each quadrant). The $T_{c1}$ value was similarly found to be
unchanged when this reduced training set was used. This is useful to retain
the reduced-size training set advantages possible with JGRV features.

FIGURE 5

Projection  of  Subsets  at  Ship  Node  2  Showing  the  Discriminant  Point, 
Projection  Means  and  Threshold  Region 


4. EXPERIMENTAL RESULTS


Various aspects of the classifier were separately investigated. Each subsection below
addresses one major issue of our moment-based classifier for the case of a ship image database.
In each case, the test number is noted together with the salient conditions and the
percentage of ships correctly classified. Each data entry in a table corresponds to 180 test
images (case A = all views) or 110 test images (case B = broadside views).

4.1  Effect  of  First-Level  Estimator 

In Tables 3 and 4, we show the results of tests performed with and without the first-level
classifier enabled. As seen from the last column, excellent performance (above 98% correct
classification) is obtained if the aspect estimator is used. This is expected since the
second-level classifier does not provide aspect estimates and, without these, different ships
at different aspect views have similar moments.


TABLE 3

Effect of First-Level Classifier on Performance (Broadside Views, Case B)

TEST   TEST CONDITIONS                        AVERAGE NUMBER OF REFERENCE   PERCENT
NO.                                           VECTORS PASSED TO             CORRECTLY
                                              SECOND-LEVEL CLASSIFIER       CLASSIFIED
1      First-Level Not Used                   20                            (illegible)
2      First-Level Class Estimator Not Used   8.12                          (illegible)
3      Aspect Estimator Not Used              4.11                          98.2
4      Both Estimators Fully Used             1.75                          98.2

4.2  Computational  Load  with  First-Level  Classifier 

In column 3 of Table 4, the number of reference vectors for which the second-level classifier
must be tested is listed. There are a maximum of four aspect views in each of the
five classes. These data correlate well with the percent of objects correctly classified.
The fewer view classes passed to the second-level classifier, the better the system performs.
In test 1, all 20 view classes are passed to the second-level classifier (i.e. all four aspect
views of all five classes, since no first-level estimator was used). In test 2, with
only the aspect estimator used, we might expect five view classes to be passed (the number
of object classes). The larger average number of 8 view classes passed reflects the indecision
in the aspect ratio test with the larger threshold of 1.5 used (versus passing only
the best aspect estimate per class). In test 3, the aspect estimator is disabled and thus
we might expect four view classes to be passed. This is close to the average number obtained.
The data in Table 4 are quite comparable to those in Table 3 with only slightly lower
percent correct performance obtained (due to the larger number of test images used, 180 versus 110,
and the low resolution of the bow and stern views now included).


TABLE 4

Effect of First-Level Classifier on Performance (All Image Views, Case A)

TEST   TEST CONDITIONS                        AVERAGE NUMBER OF REFERENCE   PERCENT
NO.                                           VECTORS PASSED TO             CORRECTLY
                                              SECOND-LEVEL CLASSIFIER       CLASSIFIED
1      First-Level Not Used                   20                            36.7
2      First-Level Class Estimator Not Used   7.24                          37.2
3      Aspect Estimator Not Used              4.78                          86.7
4      Both Estimators Fully Used             1.72                          86.7

The first-level estimator is thus useful to reduce the number of view classes to be processed
by the second-level classifier and hence the computational load on the system. The
aspect estimator is the most important part of the first-level classifier, because of the
nature of the second-level classifier. In general, if the first-level classifier does not
perform well, the second-level classifier cannot improve performance. In the tests performed
in Tables 3 and 4, a convergence threshold $T = 10^{-4}$ was used and the reference set was the
standard one in Table 2.

4.3  Convergence  of  the  Second-Level  Classifier 

In this test, we consider the number of iterations necessary in the second-level classifier
for convergence for different thresholds T. The results (Table 5) show that effectively the
same performance (98.2% correct) was obtained for all convergence thresholds T tested.
For the case of all ship images (Case A rather than Case B), a nearly constant 87% correct class
performance was obtained. As expected, the number of iterations required for (5) to
converge to the specified T decreases as T increases. In no case are more than 20 iterations
necessary, however. Several modifications associated with starting the algorithm and
choosing the step size were incorporated to ensure such convergence. Other refinements in
the step size choices in (3) can reduce the number of iterations in the second-level classifier
by a factor of two (for the databases tested thus far).

TABLE 5

Effect of Convergence Threshold T on the Number of Second-Level Class Iterations

(Case B, Broadside Test Images)

TEST   CONVERGENCE   PERCENT CORRECTLY   NO. OF SECOND-LEVEL
NO.    THRESHOLD T   CLASSIFIED          ITERATIONS PER
                     OUT OF 110          VIEW CLASS i
1      10^-4         98.2                17.04
2      10^-3         98.2                16.00
3      10^-2         98.2                14.77
4      10^-1         98.2                13.30
5      0.5           98.2                2.0
6      1.0           98.2                2.0

4.4  Number  of  References  in  the  Second-Level  Classifier 

In the prior data, only four reference views per class (all in one quadrant) were used
and excellent 98% (Case B) or 87% (Case A) correct performance was obtained. In Table 6, we
consider the performance obtained when more aspect reference views per class were used in
the second-level classifier. Tests 3 and 4 employ all 18 even aspect views. The results shown
are as expected. The excellent original performances of 98% and 86.7% were improved by only
1-4% by increasing the number of aspect reference images per object class by a factor of 4.5
(from 4 to 18). In tests on other images [4] with less symmetry, poorer performance resulted
unless reference images in two quadrants were used in the reference set for the second-level
classifier. Thus, the exact results obtained depend upon the data and its symmetry. In
general, a reduced size reference set can be used. If the number of aspect references is
reduced, the accuracy in the aspect angle estimate may also be reduced. For the cases considered,
interpolation between different aspect views is possible to provide view angle
estimates with 10° accuracy using a reduced reference set. The sign of an odd-order moment
can provide quadrant information on the aspect of an unknown test input object.

TABLE 6

Effect of Reference Set Size in the Second-Level Classifier

TEST NO.   REFERENCE SET         PERCENT CORRECT (OUT OF 110 & 180)
1          10°, 30°, 50°, 80°    98.2
2          10°, 30°, 50°, 80°    86.7
3          Even Aspect Views     99.1
4          Even Aspect Views     91.1

4.5  Weighting  Matrix  Estimates 

The final test run concerned the weighting matrix $\Sigma^{-1}$ used in the second-level classifier.
The choices considered were I and $W W^T$, with W calculated from the two dominant Fisher vectors
or from the four dominant Fisher vectors for all target views or only the broadside views.
The results show that over 90% correct recognition was obtained with only the identity matrix
used for the approximation to $\Sigma^{-1}$. Use of the full four Fisher vectors gave only 2%
better performance. In all earlier data tests shown, the identity matrix was employed as an
approximation to $\Sigma^{-1}$.


5. SUMMARY AND CONCLUSION


A new two-level classifier has been described that uses the geometrical moments as the
feature set. These features are JGRVs and thus allow use of a Bayesian classifier with only
one training set image per view class required. A nonlinear iterative algorithm is used in
the second-level classifier to obtain the final class estimate and object distortion parameters.
To reduce the number of view classes to be searched, first-level aspect and class
estimators are used. The aspect estimator simply employs the ratio $\mu_{20}/\mu_{02}$ to select only
views with a similar aspect ratio. An organized hierarchical tree search is used to obtain
class estimates. Multi-class Fisher projections are used to define the nodes in the tree
and two-class Fisher vectors are used to determine the subset at each node during testing.
In all cases, the computational load is quite low: the first-level classifier requires only
several vector inner products, the second-level classifier requires approximately 18000 operations
per iteration and fewer than 15 iterations per view class. Thus, a quite efficient
and attractive feature-space object classifier results with excellent performance (over 90%
correct recognition) for a five-class problem with aspect view object distortions present.
All parameters of the classifier have been examined and quantified for a ship image database.

A two-level classifier has been designed for use in a moment-based hybrid optical/digital processor. The simulation performance
of this pattern recognition system using real IR input test images of ships and reference moments obtained from
ship models is described with emphasis given to the preprocessing operations required.


1.  Introduction 

The use of optical processors to compute image
features for feature-based pattern recognition has recently
received renewed interest. The optically-computed
image features thus far considered include
Fourier coefficients [1-3], chord histogram distributions
[4,5], and geometrical moments [6-8]. In this
paper, a moment-based feature extractor and classification
algorithm for pattern recognition is considered
(section 2) and its performance in the classification of
ship imagery (section 3) is addressed. Specific attention
is given to classification of real input imagery
(section 5) and the image preprocessing required (section 4).


2. Optical computation of the geometrical moments

The optical system considered to generate the moments
of an input object [7] consists of an input plane
P1 (in which the input image is placed) imaged onto a
moment generating mask at plane P2. The monomials
$x^p y^q$ up to fifth-order ($p + q \le 5$) are recorded on the
P2 mask, each spatially multiplexed using a different
spatial frequency for each carrier. The optical Fourier
transform of the light distribution leaving P2 is detected
on 21 multiple parallel output detectors in the
P3 output plane and contains the moments

$$m_{pq} = \iint f(x,y)\, x^p y^q \, dx\, dy$$

of the P1 input pattern f(x,y) as detailed in [7].
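
For comparison with the optical system, a digital evaluation of the same moment set can be sketched as below. This is a minimal discrete approximation of the integral, assuming pixel coordinates as x and y; the function name `geometric_moments` and the rectangular test object are hypothetical.

```python
import numpy as np

def geometric_moments(image, max_order=5):
    """Discrete geometrical moments m_pq = sum f(x,y) x^p y^q for p + q <= max_order."""
    rows, cols = image.shape
    y, x = np.mgrid[0:rows, 0:cols].astype(float)   # pixel coordinates (assumed origin at a corner)
    return {(p, q): float(np.sum(image * x**p * y**q))
            for p in range(max_order + 1)
            for q in range(max_order + 1 - p)}

# Hypothetical binary object: a 60 x 10 pixel rectangle in a 128 x 32 pixel frame.
image = np.zeros((32, 128))
image[10:20, 40:100] = 1.0
m = geometric_moments(image)
x_centroid = m[(1, 0)] / m[(0, 0)]
y_centroid = m[(0, 1)] / m[(0, 0)]
```

The dictionary holds 21 entries, one per monomial with p + q up to 5, which matches the number of output detectors quoted above.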

These optically-generated image features are used
as inputs to a digital feature-based classifier which then
determines the object class and the orientation, scale
and aspect of the input object. The details of this classifier
are provided elsewhere [8] and are not germane
to our present discussion; however, several remarks on
the classifier follow for completeness. The optically-calculated
input moment vector m is projected by the
first-level classifier in the digital section onto a multi-dimensional
Fisher feature space [9]. From the location
of the projection vector, initial estimates of the
input object class are made. From the ratio of the normalized
second-order moments $\mu_{20}$ and $\mu_{02}$, an estimate
of the aspect ratio or aspect angle of the input
object is made. These estimates are used to select reference
vectors $m_i(\theta)$ for class i and aspect $\theta$ from storage
against which m is compared. The final decision
on the object class and the geometrical location of the
input object is made in a second-level classifier implementing
a nonlinear least-squares solution as detailed
in [8]. Our present concern is the preprocessing required
on real images before their moments m can be
reliably extracted.

As our reference database we used 180 images of
five types of ships with 36 images available per ship
(at 10° intervals around each ship, from a 90° depression
angle). This reference database was obtained from
ship models under controlled conditions. Each image
contains 128 x 32 pixels with about 2000 pixels corresponding
to the ship (for the broadside view) and
fewer than 200 ship pixels (for the bow and stern views).
The moments of 4 images per class (10°, 30°, 50° and
80°, where 0° is the bow view and 90° is the broadside
view) constituted our reference $m_i(\theta)$ database. As test
data, we used various real images of the class 2 ship
(the Leahy). A typical image is shown in fig. 1. It
shows the ship in water with a sky and shoreline background.
We used 256 x 128 pixel images with 8 bits
of gray scale for the real ships in our tests. The horizon
(separating the water and the sky background) is
seen and the depression viewing angle for the real
images is 80° (rather than 90°, as in the reference
imagery). The real image (from bottom to top) contains
four regions: (1) water, (2) the hull of the ship
and some water, (3) the superstructure of the ship with
a water background, and (4) the sky and shoreline at
the top of the image. In section 4, we detail the preprocessing
used to extract the ship from the background
and in section 5, we discuss the classification
performance obtained on such imagery.


4.  Image  preprocessing 

Feature-extraction pattern recognition algorithms
require that one object location within the input field-of-view
be extracted before the features are computed.
These operations are most commonly referred to as
segmentation and also involve noise removal and filling
in of holes on the object [10]. Care should be taken
to employ only simple image preprocessing operations
that are not computationally expensive. Thus, we used
mainly histogram operations (since they require only
simple tallies of image pixel levels) to aid in threshold
selections. A wealth of such methods exist, but their
specific implementations are quite problem-dependent.
In our case, we used context information (the water is
below the ship, the sky is above the ship and the deck
line and horizon are nearly horizontal due to the sensor
system used) to greatly simplify the ship segmentation.
Our approach is quite novel in the techniques
employed to select separate thresholds for the different
image regions and dynamically select these regions
based on the scene information. Such methods are of
use in feature extractors for diverse applications.

Fig. 1. Typical ship test image (the guided-missile cruiser, the
Leahy, ship class 2).

Fig. 2. Bimodal gray-level histogram of fig. 1 (vertical axis: number of pixels;
labeled peak: ship and sky; marked gray levels: 150 and 168).

As step 1, we formed the gray-level histogram of
fig. 1 (see fig. 2). It was bimodal as expected, extending
from 0 to 255 (8 bits). A broad peak exists at low
pixel values (corresponding to the water and noise,
which is low in intensity in fig. 1) and a sharper peak
is centered at the high 175 pixel level (corresponding
to the ship and the sky, whose pixel values are larger
in fig. 1). A well-defined valley at pixel level 150 exists.
Thus, at step 2, we thresholded the image at 150 (with
all pixel values below 150 set to zero and all pixel values
above 150 set to one). The resultant binary image
is shown in fig. 3.
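
Steps 1 and 2 can be sketched as follows for an 8-bit image. The valley search used here (smooth the histogram, take the largest peak, find a second peak by weighting counts with the squared distance from the first, then take the minimum between the two) is one common heuristic chosen for this illustration; it is not claimed to be the procedure used in the paper, and the synthetic pixel populations are invented.

```python
import numpy as np

def valley_threshold(image, smooth=5):
    """Return the gray level at the valley between the two main histogram peaks."""
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    hist = np.convolve(hist, np.ones(smooth) / smooth, mode="same")  # light smoothing
    levels = np.arange(256)
    peak1 = int(np.argmax(hist))
    # Second peak: favor large counts that lie far from the first peak.
    peak2 = int(np.argmax(hist * (levels - peak1) ** 2))
    lo, hi = sorted((peak1, peak2))
    return lo + int(np.argmin(hist[lo:hi + 1]))

# Synthetic bimodal data: a broad dark "water" mode and a sharp bright "ship and sky" mode.
rng = np.random.default_rng(1)
water = rng.normal(70, 20, 6000)
ship_sky = rng.normal(175, 8, 2000)
pixels = np.clip(np.concatenate([water, ship_sky]), 0, 255).round().astype(np.uint8)
image = pixels.reshape(80, 100)

t = valley_threshold(image)
binary = (image > t).astype(np.uint8)   # values above the valley set to one, others to zero
```

Only pixel tallies and a short search over 256 levels are needed, which is consistent with the low computational cost argued for histogram operations.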

At  step  3,  the  image  in  fig.  3  is  used  to  estimate  the 
location  of  the  four  image  regions  defined  in  section 
3.  To  achieve  this,  a  horizontal  or  row-projection 
histogram  of  fig.  3  is  formed.  This  is  a  graph  (fig.  3)  of 
sky, shoreline and water and thus extracts the ship. If
the gray-levels above $V_T$ are retained, a gray-scale segmented
ship image results. If levels above $V_T$ are set to
unity, a binary segmented ship image results (fig. 6).
Simple median filtering or other local convolution operations
can be used to suppress miscellaneous noise
pixels remaining in the background and to fill in holes
on the target object.
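
The clean-up step can be illustrated with a 3 x 3 median filter applied to a binary image. This is a minimal sketch; the window size, the edge padding, and the small test pattern are assumptions made for the example.

```python
import numpy as np

def median_filter3(binary):
    """3 x 3 median filter built from the nine shifted copies of the image."""
    padded = np.pad(binary, 1, mode="edge")
    rows, cols = binary.shape
    shifted = [padded[r:r + rows, c:c + cols] for r in range(3) for c in range(3)]
    return np.median(np.stack(shifted), axis=0).astype(binary.dtype)

# Hypothetical segmented object with one hole and one isolated background noise pixel.
img = np.zeros((9, 12), dtype=np.uint8)
img[2:7, 3:10] = 1      # object
img[4, 6] = 0           # hole on the object
img[7, 1] = 1           # noise pixel in the background
cleaned = median_filter3(img)
```

The hole is filled and the noise pixel removed, at the cost of slightly rounding the object corners, which is a known side effect of median filtering.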


5.  Image  classification 

The moments m of the image in fig. 6 were computed
and fed to our digital first-level Fisher projection
class estimator. This first-level classifier omitted class 1
and 3 ships as possible class matches. The second-level
classifier returned class 2 as the most-likely object
class. This classifier also provides confidence levels for
each possible ship class (classes 2, 4 and 5) passed by
the first-level classifier. The class 4 ship, another
guided-missile cruiser, had the second-best confidence
rect)  class  2  match.  The  correct  aspect  angle  (70°)  and 
scale  (SO'/) of  the  input  object  are  also  provided  by 
the  classifier. 


6.  Summary  and  conclusion 

A necessary aspect of feature extractors for pattern
recognition is the image preprocessing required. A
novel digital segmentation preprocessing procedure of
quite general use was detailed for a ship pattern recognition
scenario. Such operations are essential if optical
or digital feature extraction processors are to achieve
good performance. The successful classification of a
real input image using moment features and a unique
two-level classifier was demonstrated. Similar results
were obtained for other real images.

1. INTRODUCTION

A feature space processor for multi-class distortion-invariant pattern recognition
is detailed in Section 2. A moment feature vector space is considered. Test
data [1,2] on a robotic database are summarized in Section 3. Results on a ship
database, using real input imagery with references from models, are presented in
Section 4, with attention to preprocessing, distortion parameter estimation, and class
identification.

2.  PROCESSOR 

A moment feature space is easily generated optically [3,4,5] or digitally [6].
Its outputs can easily be corrected for processing errors in post-processing [3].
Moments are jointly Gaussian random variables [2] due to sampling with respect to
in-plane distortions. Thus, they allow use of a Bayesian classifier and can
minimize P_e. To determine the class i (object class c and aspect view \phi) and
the object's distortions (described by a distortion parameter b) for each computed
input moment vector m, we calculate

g_i = [m - m_i(b)]^T \Sigma^{-1} [m - m_i(b)],    (1)

with b calculated iteratively (k is the iteration index) using

b^{k+1} = b^k + [(J^T \Sigma^{-1} J)^{-1} J^T \Sigma^{-1}] [m - m_i(b^k)].    (2)


The class i that minimizes (1) defines c and the out-of-plane rotation angle
(aspect) \phi of the input, whereas b provides estimates of translations, scales, and
in-plane rotations. The number of iterations k can be reduced to 4-6, \Sigma \approx I can
be used in (1) and (2), and J in (2) can be calculated as an update [1,2]. This
significantly reduces the computational load per class/aspect i.
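
The iteration in (2) has the form of a Gauss-Newton update, and a minimal sketch of it is given below. The reference model `toy_model`, the finite-difference Jacobian and the starting point are hypothetical stand-ins chosen only for illustration (the report computes J as an update and uses true moment references); \Sigma = I is assumed, as the text permits.

```python
import numpy as np

def numerical_jacobian(model, b, eps=1e-6):
    """Forward-difference Jacobian J[r, s] = d model(b)[r] / d b[s]."""
    b = np.asarray(b, dtype=float)
    m0 = model(b)
    J = np.empty((m0.size, b.size))
    for s in range(b.size):
        step = np.zeros_like(b)
        step[s] = eps
        J[:, s] = (model(b + step) - m0) / eps
    return J

def estimate_distortion(m, model, b0, sigma_inv=None, n_iter=6):
    """Iterate Eq. (2) from b0; return the estimate b and the distance g of Eq. (1)."""
    m = np.asarray(m, dtype=float)
    b = np.asarray(b0, dtype=float)
    if sigma_inv is None:
        sigma_inv = np.eye(m.size)          # Sigma ~ I simplification
    for _ in range(n_iter):
        J = numerical_jacobian(model, b)
        r = m - model(b)                    # residual m - m_i(b^k)
        A = J.T @ sigma_inv @ J
        b = b + np.linalg.solve(A, J.T @ sigma_inv @ r)
    r = m - model(b)
    return b, float(r @ sigma_inv @ r)

def toy_model(b):
    """Hypothetical 'moment' vector as a smooth function of scale a and shift x0."""
    a, x0 = b
    return np.array([a, a * x0, a**2 + x0**2, a**3])

b_true = np.array([1.2, 0.5])
b_est, g = estimate_distortion(toy_model(b_true), toy_model, b0=[1.0, 0.0])
```

In a full classifier this routine would be run once per candidate aspect-class i, and the i with the smallest returned g would be selected.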


The major problem is the large number of aspect-classes i that potentially need

to be searched. To relieve this, we use two first-level estimators [1,2] to

estimate the aspect (this is achieved by A = \mu_{20}/\mu_{02}) and class (a hierarchical
tree  is  used  for  this,  with  the  node  structure  chosen  from  a  multi-class  Fisher 
projection  and  with  a  two-class  Fisher  discriminant  vector  used  per  node).  As  we 

show  in  Section  3,  this  reduces  the  number  of  aspect-classes  i  to  be  searched 

and  thus  makes  the  processor  very  computationally  efficient.  A  block  diagram  of 
the  system  is  shown  in  Figure  1. 
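
As an illustration of the two-class Fisher discriminant vector used at each tree node, the sketch below computes w = S_w^{-1}(mu_1 - mu_2) for two small groups of feature vectors. The data, the function name `fisher_node` and the midpoint decision threshold are assumptions made for this example; the report does not specify how the node threshold is set.

```python
import numpy as np

def fisher_node(X1, X2):
    """Two-class Fisher discriminant vector and a midpoint threshold (assumed rule)."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    D1, D2 = X1 - mu1, X2 - mu2
    S_w = D1.T @ D1 + D2.T @ D2           # pooled within-class scatter matrix
    w = np.linalg.solve(S_w, mu1 - mu2)   # w = S_w^{-1} (mu1 - mu2)
    t = 0.5 * (w @ mu1 + w @ mu2)         # midpoint of the projected class means
    return w, t

# Hypothetical 2-D feature vectors for the two branches of one node
X1 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X2 = X1 + 4.0
w, t = fisher_node(X1, X2)
goes_left_1 = X1 @ w > t   # branch decisions for the first group
goes_left_2 = X2 @ w > t   # branch decisions for the second group
```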

3.  PIPE  PART  TEST  RESULTS 

Nine different pipe parts (4 classes) viewed from a 50° depression angle
were digitized (128 x 128 pixels) with 36 images per part (one image every 10°
in aspect) and used as our test database. Test results are summarized in Table
1. They show that 9 out of 36 references are adequate (Test 1); that use of the
first-level estimator reduces the number of i to be searched in (1) to 10 (Test
2); that the number of iterations k in (2) is only 6 over a large \Delta g_t range (Test 3);
and that \Sigma \approx I in (1) and (2) is adequate (Test 4). As seen in Table 1, the system
of Figure 1 can correctly classify over 97% of the 324 images (using only 9 x 4
= 36 references).

Figure 1: Block diagram of a multi-level moment feature-space classifier

Table 1: Representative Pipe Part Data (Different Test Conditions)

4.  DISTORTION  PARAMETER  ESTIMATION  ACCURACY 

Related tests on another database [2,7] showed comparable performance and
similar operational parameters. In this database, the reference objects were
obtained from models, and in tests against real-world IR images excellent
recognition was obtained. The preprocessing required [7] used only simple 1D
and 2D histogram operations and thresholding (to maintain low computational
overhead).

We now consider the class c, aspect \phi, scale a and translation x_0 estimation
accuracy of the system for a second five-class database (36 images at 10°
aspect intervals per class) using only four references per class. The true object
was the 80° aspect view of the class 1 image. A real IR input image (vs.
references obtained from models) at a depression angle 10° different from that of
the reference set was used, with real IR noise present in the input. The tests
(Table 2) show perfect class and aspect classification for \Delta g_t = 10^{-4} to 10^{-1} (for
\Delta g_t = 0.5, errors resulted as expected) and excellent shift (x_0 in pixels) and scale
factor (a) distortion parameter estimation. All distortion parameters were estimated
to within 5%, a limit set by the input resolution, noise, and similar factors.

Table 2: Results of Class and Distortion Estimation Tests

(True  Class  1,  Aspect  80°) 
to be achieved in optical correlators, thus making such systems more practical.
The synthesis of four different types of SDFs is reviewed. Their performance in
extensive projection tests on 144 images is presented, together with initial
performance tests in the presence of noise. This distortion-invariant and
shift-invariant pattern recognition algorithm can also be implemented digitally.
1. INTRODUCTION

The frequency plane optical matched spatial filter (MSF) correlator1
has been the most studied optical pattern recognition system of the
last 20 years. The sensitivity of the MSF to geometrical distortions
between the input and reference object is a well-known shortcoming
of any correlator. The use of multiple MSFs can reduce such problems
at the cost of increased system complexity and reduced light
budget efficiency.2 Special frequency plane weighting and filter synthesis
techniques can reduce this sensitivity, but cannot overcome it.2
Space-variant correlators4 and coded-phase processors9 can overcome
various distortions, but at the expense of shift invariance and
multitarget recognition (although they still retain the processing gain
advantages of a correlator).

In this paper, we describe the synthesis and performance of MSFs
formed from synthetic discriminant functions (SDFs). These linear
combination filters retain the shift invariance and processing gain of
correlators while overcoming their sensitivity to geometrical distortions.
In Sec. 2, we review the synthesis techniques for four different
SDFs.4 In Sec. 3, we describe the data base used in our projection
simulations. Our noise-free results7 are summarized in Sec. 4, and
initial performance in the presence of noise8 is presented in Sec. 5.
Other variants of SDFs also exist9-11; however, SDFs are the most
general and widely tested of such filters.

Invited Paper PR-10* received April J, 1984; revised manuscript received April V, 1984;
accepted for publication June 23, 1984; received by Managing Editor Aug. 23, 1984.
© 1984 Society of Photo-Optical Instrumentation Engineers.


2.  SDF  SYNTHESIS 

A unified SDF synthesis technique was first advanced in Ref. 12 and
recently was more fully described.4 For completeness, we briefly
review the different SDF synthesis methods. The basic concept of
SDF synthesis is to utilize a training set of images of each object
class. From the correlation matrix of the full training set, we synthesize
a SDF h(x, y) that is a linear combination of the training set of
images. Depending upon the purpose of the filter, different conditions
will be placed on h, and different SDFs can be synthesized.

The simplest derivation of a SDF occurs for the case of one filter h
that is to yield a constant correlation output c = 1 for all versions {f_n}
of objects f in one class; i.e.,

f_n \circledast h = c = 1,    (1)

where \circledast denotes correlation. We restrict h to being a linear combination
of the {f_n}; i.e.,

h = \sum_m a_m f_m.    (2)

For notational simplicity, we do not show the spatial dependence of
the functions f and h in Eqs. (1) and (2). We rewrite Eq. (1) for the
projection case (i.e., the central value of the correlation output) as
f_n \cdot h = 1 (where vectors, denoted by boldface type, now describe each
function). Substituting Eq. (2) into Eq. (1), we find

f_n \cdot h = f_n \cdot \sum_m a_m f_m = \sum_m a_m r_{nm} = 1,    (3)

where r_{nm} denotes the elements of the correlation matrix R for {f_n}.
In matrix-vector notation, Eq. (3) becomes

R a = u,    (4)

where u is the unit vector (i.e., u contains all "1" elements). The filter
h that satisfies Eq. (1) is thus defined by

a = R^{-1} u.    (5)


We denote the SDF in Eq. (1), defined by Eq. (5), as an equal
correlation peak (ECP) SDF. It is only of use in intraclass pattern
recognition. The remaining types of SDFs4 are described quite similarly
to Eq. (5), with different training sets, correlation matrices, and
exogenous vectors u used.
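
The ECP synthesis of Eqs. (2) through (5) can be sketched in a few lines. The three short "images" below are hypothetical flattened training vectors used only to make the example self-contained, and the function name `ecp_sdf` is ours.

```python
import numpy as np

def ecp_sdf(train):
    """train: (N, P) array with one flattened training image per row.

    Returns the filter h (length P) and the coefficient vector a (length N).
    """
    R = train @ train.T            # correlation matrix, r_nm = f_n . f_m
    u = np.ones(len(train))        # exogenous vector of all ones
    a = np.linalg.solve(R, u)      # a = R^{-1} u, Eq. (5)
    h = a @ train                  # h = sum_m a_m f_m, Eq. (2)
    return h, a

# Hypothetical binary training images, flattened to vectors
train = np.array([[1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 1.0, 1.0, 0.0]])
h, a = ecp_sdf(train)
projections = train @ h            # f_n . h for every training image
```

By construction the projections of the training images equal 1; the projections of images outside the training set are not constrained and have to be checked by test, as done in Sec. 4.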

For both intraclass recognition and interclass discrimination, a
SDF is chosen to recognize objects {f_i} in class i with unit output
and to yield zero output for objects {f_j} in other classes j; i.e.,

f_i \circledast h_j = \delta_{ij}.    (6)

This mutual orthogonal function (MOF) SDF for a two-class problem
is synthesized using a training set of N_1 images {f_{1n}} of class 1
objects and N_2 images {f_{2n}} of class 2 objects as

h_1 = \sum_m a_m f_m,
h_2 = \sum_m b_m f_m,    (7)

where f_m \cdot h_1 = 1 for objects in class 1, f_m \cdot h_1 = 0 for objects in class 2,
and the summations in Eq. (7) are over 1 \le m \le N_1 + N_2. The
matrix-vector solutions for the a_m and b_m in Eq. (7) are

a = R_{12}^{-1} u_1,
b = R_{12}^{-1} u_2,    (8)

where R_{12} is the (N_1 + N_2) x (N_1 + N_2) correlation matrix of the full
training set of data, and where u_1 = [1,...,1,0,...,0]^T and u_2 =
[0,...,0,1,...,1]^T. In u_1, there are N_1 ones (for the N_1 class 1
objects) and N_2 zeros (for the N_2 class 2 objects). Similarly, u_2 has N_1
zeros and N_2 ones. The extension of Eqs. (6), (7), and (8) to more than
two object classes follows directly.4 These MOF SDFs require one
filter per object class and a correlation matrix R of larger order than
in Eq. (5).
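
A minimal sketch of the two-class MOF synthesis in Eqs. (7) and (8) follows. The training vectors are hypothetical and the function name `mof_sdf` is ours; only the structure of R_12, u_1 and u_2 is taken from the text.

```python
import numpy as np

def mof_sdf(class1, class2):
    """Return the two MOF filters (h_1, h_2) for two classes of flattened images."""
    train = np.vstack([class1, class2])
    n1, n2 = len(class1), len(class2)
    R12 = train @ train.T                              # (N1+N2) x (N1+N2) correlation matrix
    u1 = np.concatenate([np.ones(n1), np.zeros(n2)])   # ones for class 1, zeros for class 2
    u2 = np.concatenate([np.zeros(n1), np.ones(n2)])   # zeros for class 1, ones for class 2
    h1 = np.linalg.solve(R12, u1) @ train              # a = R12^{-1} u1
    h2 = np.linalg.solve(R12, u2) @ train              # b = R12^{-1} u2
    return h1, h2

# Hypothetical training images (two per class)
class1 = np.array([[1.0, 1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 1.0, 0.0, 0.0]])
class2 = np.array([[0.0, 0.0, 1.0, 1.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0, 1.0]])
h1, h2 = mof_sdf(class1, class2)
```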

In some recognition cases, a single multilevel nonredundant filter
SDF h (we use the simpler term “multilevel SDF”) can be used to
recognize multiple object classes. The general requirement for such a
filter can be written as

f_n \circledast h = n,    (9)

i.e., the value n of the correlation output defines the class n of the
input object. For a three-class intraclass and interclass recognition
and discrimination problem, we use N_1, N_2, and N_3 training set
images in classes 1, 2, and 3, respectively. The filter is defined by

h = \sum_m a_m f_m,    (10)

where the summation is over 1 \le m \le N_1 + N_2 + N_3, i.e., the full
training set. The filter in Eq. (10) satisfying Eq. (9) is defined in
matrix-vector notation by

a = R_3^{-1} u_3,    (11)

where R_3 is the N_1 + N_2 + N_3 correlation matrix for the full training
set and u_3 = [1,...,1,2,...,2,3,...,3]^T has N_1 ones, N_2 twos, and N_3
threes.
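
The multilevel SDF of Eqs. (10) and (11) differs from the ECP case only in the exogenous vector. The sketch below uses hypothetical training vectors, and decoding by rounding to the nearest level is our assumption; the text states only that the output value n defines the class.

```python
import numpy as np

def multilevel_sdf(train, labels):
    """Single filter whose projection on each training image equals its class label."""
    R3 = train @ train.T                    # correlation matrix of the full training set
    u3 = np.asarray(labels, dtype=float)    # e.g. [1, 1, 2, 2, 3, 3]
    return np.linalg.solve(R3, u3) @ train  # a = R3^{-1} u3, h = sum_m a_m f_m

# Hypothetical training set: six linearly independent 7-pixel "images"
train = np.eye(6, 7) + np.eye(6, 7, k=1)
labels = [1, 1, 2, 2, 3, 3]
h = multilevel_sdf(train, labels)
decoded = np.rint(train @ h).astype(int)    # nearest-level decoding (assumed rule)
```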

The final class of SDFs is the K-tuple two-level nonredundant
multiple filter SDF (we use the simpler term “K-tuple SDF”). We
describe such filters for the four-class (N = 4) two-filter (K = 2) case
(i.e., 2^K = N). The four object classes are denoted by {f_1}, {f_2}, etc.,
and the two filters by h_a and h_b. The object class is determined from
the outputs from both filters, as in Table I. For simplicity, binary
(0,1) values are used. Other values are preferable since (0,0) can also
indicate no input. In practice, we select K to satisfy 2^K \ge N + 1 and
thus avoid the ambiguity possible in the (0,0) output case.


TABLE I. Truth Table for K-tuple Two-Level Four-Class SDF

Input/Output    h_a    h_b
{f_1}           0      0
{f_2}           0      1
{f_3}           1      1
{f_4}           1      0

Synthesis of h_a and h_b to satisfy Table I follows directly. The two
filters are linear combinations of the full training set of data:

h_a = \sum_m a_m f_m,
h_b = \sum_m b_m f_m.    (12)

The coefficients in Eq. (12) are defined by

a = R_4^{-1} u_{4a},
b = R_4^{-1} u_{4b},    (13)

where R_4 is the correlation matrix of the full training set of images for
all four classes (N_1, N_2, N_3, and N_4 images in each class, respectively).
The vector u_{4a} in Eq. (13) has N_1 + N_2 zeros and N_3 + N_4 ones (for
h_a), with u_{4b} being similar.
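
The K-tuple synthesis of Eqs. (12) and (13) and the decoding of the two filter outputs can be sketched as follows. The code words are those of Table I; the training vectors, the function names and the 0.5 decision threshold on each output are assumptions made for this illustration.

```python
import numpy as np

# Table I code words (h_a output, h_b output) for classes 1 to 4
CODES = {1: (0, 0), 2: (0, 1), 3: (1, 1), 4: (1, 0)}

def ktuple_sdf(train, labels):
    """Return (h_a, h_b) so that each training image projects onto its code word."""
    R4 = train @ train.T
    u4a = np.array([CODES[c][0] for c in labels], dtype=float)
    u4b = np.array([CODES[c][1] for c in labels], dtype=float)
    h_a = np.linalg.solve(R4, u4a) @ train   # a = R4^{-1} u4a
    h_b = np.linalg.solve(R4, u4b) @ train   # b = R4^{-1} u4b
    return h_a, h_b

def decode(image, h_a, h_b, threshold=0.5):
    """Threshold both projections and look the resulting bit pair up in CODES."""
    bits = (int(image @ h_a > threshold), int(image @ h_b > threshold))
    for cls, code in CODES.items():
        if code == bits:
            return cls
    return None

# Hypothetical training set: two linearly independent 9-pixel "images" per class
train = np.eye(8, 9) + np.eye(8, 9, k=1)
labels = [1, 1, 2, 2, 3, 3, 4, 4]
h_a, h_b = ktuple_sdf(train, labels)
decoded = [decode(f, h_a, h_b) for f in train]
```

Changing the projection value assignments only requires editing `CODES`; the synthesis itself is unchanged.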

Other obvious combinations of these four basic types of SDFs
follow directly. In all cases, the filter function is of the form in Eq. (5),
with a different correlation matrix R and exogenous vector u. This
unified SDF synthesis method significantly simplifies off-line synthesis
of the SDF. Since the projection values for the different classes
in the different SDFs are fixed by the synthesis algorithm, we refer to
such filters as deterministic SDFs.

3.  DATA  BASE 

A key new aspect of this paper is the proper evaluation of the performance
of the various SDFs described in Sec. 2. SDFs require a large data
base  to  properly  select  training  set  images  that  are  valid  statistical 
representations  of  the  data  in  each  object  class,  with  a  sufficient 
number  of  additional  test  images  (not  in  the  training  set)  remaining 
to  allow  sufficiently  valid  tests  on  the  algorithm.  In  all  experiments 
performed,  the  computational  load  was  so  large  that  only  correlation 
plane  projection  values  (i.e.,  the  correlation  value  at  the  point  of 
registration)  were  evaluated  as  in  Eq.  (3). 
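
The relation between a projection value and the full correlation output can be checked on a small one-dimensional example: the zero-shift sample of the correlation equals the vector inner product f . h. The sample values below are hypothetical.

```python
import numpy as np

f = np.array([1.0, 2.0, 3.0, 0.0])       # hypothetical input samples
h = np.array([0.0, 1.0, 0.5, 2.0])       # hypothetical filter samples

full = np.correlate(f, h, mode="full")   # correlation at all 2N-1 relative shifts
projection = full[len(h) - 1]            # value at the point of registration (zero shift)
inner = f @ h                            # projection computed directly
```

Evaluating only `inner` costs N multiplications per image instead of a full correlation plane, which is the saving exploited in the experiments.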

Our most extensive multiclass object data base available consisted
of images of four ships from 90° depression angle with 36 views
available per ship (at 10° intervals in aspect around the full 360° of
the ship). In Fig. 1, we show the broadside views of the four ships.
Clearly, other aspect views, such as the bow and stern, contain
significantly less object data. The images were each 128 X 32 pixels.
For the broadside views, the target contained approximately 1200
pixels out of 4000 pixels in the full frame. For the bow and stern
views, about 200 pixels (out of 4000) were present on the target. The
classes assigned to each ship and the name and type of each are noted
in Table II. The images in class 1 are numbered 1 through 36 (1 is the
bow, 18 is the stern, etc.). Class 2 images are numbered 37 through
72, etc. All images were binarized to only “0” and “1” valued pixels,
with the threshold selected from simple histogram operations.15 Two
sets of images, one with and one without the hull present, were
formed and used. This image data base allows the 3-D aspect distortion
invariance of our SDF correlator to be verified and its performance
to be quantified.

4.  NOISE-FREE  PROJECTION  TEST  RESULTS 

In Table III, we summarize our digitally simulated SDF projection
performance obtained for the four major types of SDFs described in
Sec. 2, using the first four ship image data bases described in Sec. 3.
Six  different  tests  using  different  SDFs  were  considered.  The  type  of 
SDF  used  and  the  six  training  set  images  used  per  class  are  noted  in 
the  table.  The  number  of  errors  obtained  out  of  the  36  images  in  each 
of  the  four  object  classes  is  noted,  together  with  the  percentage 
correctly  recognized. 

In test 1, the ECP SDF in Eq. (5) was formed using only six class 1
images and was tested against all 36 aspect views of the class 1 object.
All 36 projection values were within 3% of the deterministic value of
unity selected in Eq. (5), and thus 100% correct performance was
obtained. Test 2 was similar for the class 2 object, and again 100%
performance was obtained (with all 36 projection values within 5% of
1.0). These test 1 and 2 results show the intraclass recognition
performance of SDFs in the face of 3-D aspect distortions. All SDFs
require a training set that is a valid statistical representation of the
object class for the distortions considered, and in this case 10° to 20°
variation in aspect can be tolerated (in agreement with experiments
in Refs. 2 and 3). The six training set images used per class in tests 1
and 2 were chosen at approximately 50° intervals around the ship,
with three or four images taken from each side of the ship (0° to 180°
and 180° to 360° aspect views).


Fig. 1. (a)-(d) Broadside views of the data base images in classes 1 to 4,
respectively.


TABLE II. Ship Image Data Base Used

Ship class    Ship name    Type
1             Moskva       Soviet helicopter cruiser
2             Leahy        U.S. guided-missile cruiser
3             Hope         International hospital ship
4             Albany       U.S. guided-missile cruiser

In tests 3 through 6, the discrimination as well as recognition
performance of our other SDFs was considered. The two-class MOF
SDF defined by Eqs. (6) and (8) was evaluated first (test 3) using the
12 training set images in tests 1 and 2 to form the MOF SDF. The
projection values used in the filter synthesis were (1,0). In determining
the object class, the projection values P_1 of each input object on
only the h_1 MOF SDF in Eq. (7) were calculated, and the decision on
the input object class was made based on whether P_1 < 0.5 or whether
P_1 > 0.5 (where the 0.5 threshold level was chosen as being midway
between the original 0 and 1 deterministic projection values). The
projection performance obtained was excellent, with 69 of 72 images
correctly classified (95.8% correct identification). In test 4, our
multilevel SDF defined in Eqs. (9) and (11) was tested on three object
classes with deterministic projection values of 0, 1, and 2, respectively,
for the three classes. Six training set images per class were
used, and excellent performance resulted, as shown, with only five
errors obtained out of the 108 test images (95.4% correct
performance).

In tests 1 through 4, the hull of the ship was present in the image
data base used. Comparable results occur if the hull is not present. In
tests 5 and 6, our K-tuple SDF was used for the full four-class
recognition and discrimination testing on all 144 images. A new
training set of six images per class was used (selected as described in
Ref. 8) to synthesize the h_a and h_b filters, and the image data base
with the hull of the ship removed was employed (since it yielded
better performance due to more discriminatory information in the
superstructure of the ship). In test 5, the filters were synthesized with
the deterministic projection values noted in Table I. The results were
quite good, with only 14 errors out of 144 images (90.3% correct
recognition).

However, as shown, 12 of the 14 errors occurred for the class 4
object. The majority of these errors were at the bow and stern, and all
of these errors were due to the projection values on the second filter
h_b being above the 0.5 threshold (recall from Table I that h_b should
force class 4 projections to 0, or to below 0.5). Inspection of Fig. 1
shows that the class 4 object is the largest ship. Since it appears to be
more difficult to force the projection values of a large object to zero
(compared to the ease of forcing the projection values of smaller
objects to zero), we altered the projection value choices in Table I to
those shown in Table IV.

As seen in Table IV, the new deterministic projection value
choices have reversed the projection values for class 3 and 4 objects.
Both filters (h_a and h_b) are now designed to yield projection values of
(1,1) for the largest (class 4) object. The results for this filter are
shown in Table III, test 6. They are excellent, with only two errors
out of all 144 test images (98.7% correct classification). Attempts to
obtain comparable performance when the hull was present were not
successful for the four-class problem (since the superstructure of the
ship clearly contains the major discriminatory information). In
general, the amount of the hull that is visible varies considerably with
the ship's load, and thus the hull data cannot be reliably assumed to
be present.


TABLE III. Noise-Free SDF Projection Performance Test Results

Test                                                                                    Errors per class    Percent
number  Type of SDF          Training set                                               1    2    3    4    correct
1       ECP (class 1)        (1, 8, 10, 15, 20, 25)                                     0    -    -    -    100
2       ECP (class 2)        (38, 45, 50, 55, 60, 65)                                   -    0    -    -    100
3       MOF (1,0)            (Same as tests 1 and 2)                                    1    2    -    -    95.8
4       Multilevel (0,1,2)   (Six images per class as above)                            2    0    3    -    95.4
5       K-tuple (Table I)    New training set (six images per class, hull removed)      0    2    0    12   90.3
6       K-tuple (Table IV)   New training set (six images per class, hull removed)      0    2    0    0    98.7

5.  SDF  PROJECTION  PERFORMANCE  WITH  NOISE 

In Table V, we summarize the performance of the K-tuple SDF in
test 5 of Table III in the presence of noise. In tests 7 through 11,
Gaussian, zero-mean noise was added only to the test set of images,
and a noise-free training set of images was used. For the 24 images
present in the training and test sets, noise was added only during
testing. In tests 12 through 15, noise was present in the training set
images also. The noise added was of zero mean, with a standard
deviation \sigma as given in the table. The noisy images were then binarized,
and these binarized noisy images were used in filter synthesis
and testing. For the data in Table V, noise was present everywhere
(i.e., in the background and on the target). Because of the binarization,
the effect of noise is different when present in the background
and on the object. In the background, noise adds +1 valued pixels,
and on the target it forces +1 valued pixels to 0. To stabilize the
results obtained, we chose not to make a decision on the object class
when the projection values were within ±0.03 of the threshold. These
“no-decision” cases are indicated in parentheses. The total number of
errors, total number of correctly classified objects, and percentage of
objects correctly classified (out of the 144 test images used) are given
in the table.
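
The noise procedure and the no-decision rule described above can be sketched as follows. The re-binarization level of 0.5 is an assumption (the text says only that the noisy images were binarized), and the function names and the small test image are hypothetical.

```python
import numpy as np

def noisy_binary(image, sigma, rng, level=0.5):
    """Add zero-mean Gaussian noise to a 0/1 image and re-binarize (assumed level 0.5)."""
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return (noisy > level).astype(float)

def decide(projection, threshold=0.5, band=0.03):
    """Return 1 or 0, or None (no decision) within +/- band of the threshold."""
    if abs(projection - threshold) <= band:
        return None
    return int(projection > threshold)

rng = np.random.default_rng(0)
image = np.zeros((4, 8))
image[1:3, 2:6] = 1.0                          # hypothetical binary target on a zero background
clean_copy = noisy_binary(image, 0.0, rng)     # sigma = 0 leaves the image unchanged
noisy_copy = noisy_binary(image, 0.4, rng)     # background pixels may flip to 1, target pixels to 0
```

With this rule a projection of 0.52 yields no decision, while 0.9 and 0.1 map to the 1 and 0 outputs respectively.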

The results obtained require discussion. A reduction in the
number of class 4 object errors with increasing noise was observed
(tests 7 through 10), up to \sigma = 0.4 noise. Then, for \sigma = 0.6, the total
number of errors increases dramatically. To explain this performance,
we denote the projection value for filter h_i and object class j
by P_i(j). We note that all class 4 errors in test 7 are errors in P_b(4) that
are above 0.5 (whereas they should be 0.0). As noise is added to the
class 4 object, the zero-valued pixels introduced on the object cause
the P_b(4) values to decrease. The +1 valued pixels introduced into
the background also cause P_b(4) to decrease (since much of the upper
portion of the h_b filter is negative-valued, as needed to force the full
projection to 0). Thus, as noise increases, the P_b(4) projection values
decrease. For \sigma = 0.4, all P_b(4) projections are now below 0.5 and
hence correct, and thus no class 4 errors occur.


TABLE IV. New Projection Value Choices Used in Test 6 of Table III

Input/Output    h_a    h_b
{f_1}           0      0
{f_2}           0      1
{f_3}           1      0
{f_4}           1      1

However, the P_b(2) and P_b(3) projections for filter h_b on class 2
and 3 objects also decrease as \sigma of the noise increases. These projection
values were intended to be 1.0, and initially all were quite close to
1.0. As \sigma increases, their values decrease gradually, and at \sigma = 0.6
nearly all filter projections for class 2 objects are below 0.5. The
decrease in projection value is quite gradual, and the sharp increase
in the number of errors (from tests 10 to 11) occurs because most
projections now pass below the threshold level at this \sigma level of noise.
The class 2 projection outputs all changed from (0,1) to (0,0); thus,
all class 2 errors in test 11 were class 2 objects classified as class 1
objects. All of our test data used the same fixed-decision threshold
level. Use of adaptive thresholds can significantly improve
performance.

When noise was present in both the test and training sets (all noise
is uncorrelated between images), performance remained rather stationary
(12 to 14 errors) for \sigma values up to 0.2 (tests 12 through 14).
When the training set noise was 0.3 or larger, the number of errors
increased significantly (from 14 in test 14 to 52 in test 15). At this
noise level, the filters are simply not valid representations of the
objects. Recall that the input signal-to-noise ratio (SNR) is different
for each image since fewer object pixels exist and more background is
present for aspect views further away from broadside views. The
performance shown in Table V is still very impressive and should be
adequate for most object identification applications over a significant
range of sensor noise levels. The tests in Table V were repeated
for six training set images per class, chosen at evenly spaced intervals
of about every 50°, and comparable results were obtained. The tests
in Table V were also repeated for the case when noise was present
only in the background of the object (not on the object). Comparable
results were obtained, with the number of errors changing slightly
less dramatically (since the equivalent input SNR is better for a given
\sigma of the noise when noise is present only in the background of the
object, rather than on the object and in the background).

As our final noise performance test, the K-tuple SDFs in test 6
(using the projection values in Table IV rather than Table I) were
used with varying amounts of noise added to the background only
(similar results were obtained when noise was present in the background
and on the target) in both the test data and the training data.
The results (Table VI) show the excellent performance expected of a
correlator in the presence of noise.

6.  SUMMARY  AND  CONCLUSIONS 

The advantages of a correlator (processing gain, good performance
in noise, shift invariance, and multiple object recognition) can be
retained and the disadvantages (sensitivity to geometrical distortions)
can be overcome by synthesizing the matched spatial filter
from SDFs. These SDFs are linear combinations of the training sets
of imagery per object class. A unified formulation for four different
types of SDFs has been reviewed, and their performance with and
without noise has been quantified. Achieving optimum performance
for such filters is a complicated data-dependent function of the
training set images used and the projection values selected. Initial
guidelines for such selections were advanced and verified by simulation
experiments.


TABLE V. K-tuple SDF Performance (Table I Projection Values) in the Presence of Noise in the Training and/or Test Data*

        Noise standard                                           Total      Number      Percent
Test    deviation (\sigma)    Number of errors per class         number     correctly   correctly
number  Training  Testing     1       2       3       4          of errors  classified  classified
7       0.0       0.0         0(0)    2(0)    0(0)    10(2)      12         130         90.3
8       0.0       0.2         0(0)    1(1)    0(0)    8(4)       9          130         90.3
9       0.0       0.3         0(0)    1(1)    0(0)    2(3)       3          137         95.1
10      0.0       0.4         0(0)    3(9)    0(3)    0(0)       3          129         89.6
11      0.0       0.6         0(0)    27(9)   36(0)   0(0)       63         72          50.0
12      0.0       0.0         0(0)    2(0)    0(0)    10(2)      12         130         90.3
13      0.2       0.0         0(0)    1(1)    0(0)    13(2)      14         127         88.2
14      0.2       0.2         0(0)    2(0)    0(0)    12(2)      14         128         88.9
15      0.3       0.2         17(1)   9(1)    1(0)    26(2)      52         88          61.1

*No-decision cases with projection values within ±0.03 of the threshold are noted in parentheses; noise everywhere, hull not present.


TABLE VI. K-tuple SDF Performance (Table IV Projection Values) in the Presence of Noise in the Training and/or Test Data*

Columns: Test number; Noise standard deviation (\sigma), Training and Testing; Number of errors per class (1, 2, 3, 4); Total number of errors; Number correctly classified; Percent correctly classified.

*No-decision cases with projection values within ±0.03 of the threshold are noted in parentheses.
ABSTRACT

A  new  class  of  discriminant  filter  functions  for  use  in  a  matched  filter  correlator  for 
multi-class  distortion-invariant  pattern  recognition  is  described.  Three  variations  of  these 
optimal  linear  discriminant  functions  (OLDFs)  that  optimize  different  performance  measures 
are  described  and  initial  performance  results  are  presented. 

1 .  INTRODUCTION 


Correlators  represent  a  powerful  class  of  pattern  recognition  architectures  that  allow 
multiple  targets  to  be  located  and  that  provide  excellent  performance  in  noise.  Optical 
systems  [1,2]  easily  achieve  the  correlation  operation  in  real-time  and  various  compact  ver¬ 
sions  of  such  systems  have  been  fabricated  [3]  and  discussed  [4],  Co;'  relators  are  well  known 
to  be  quite  susceptible  to  geometrical  distortions  between  the  input  and  reference  object. 

A most attractive technique to achieve distortion-invariant correlation uses synthetic
discriminant functions (SDFs) [5], projection SDFs [6,7], correlation SDFs [8], or related
methods [9,10]. In general, these prior techniques achieved filter synthesis by forcing
fixed projection values for training set images in two classes [5,7,9]. In [8], a
least-squares solution and a solution that maximized the peak-to-sidelobe ratio plus the
class 1 to class 2 outputs were employed. In [11], class 2 was treated as noise and the SNR
was maximized.

In this paper, we briefly review the 5 standard SDFs using a new notation (Section 2).
We then describe three new optimal linear discriminant function (OLDF) filters that maximize
different performance measures (Section 3). These filters differ from the original SDF and
other work in that they are optimal (i.e. they maximize various performance measures useful
in discriminant pattern recognition). They are thus preferable, since their predicted $P_D$,
$P_{FA}$ and $P_e$ performance and the effects of noise on them should be easier to analyze
(theoretically and statistically). Initial simulation results, using only correlation plane
projection values, are advanced (Section 4) to demonstrate and quantify the intra-class
recognition and inter-class discrimination performance of these OLDFs for multi-class cases
and in the presence of noise.


2. SDFs


The 5 standard SDFs [5] are now reviewed for background and to introduce the new notation
that is most appropriate for the description of our OLDFs in Section 3. The inner product of
vectors $x$ and $y$ is defined as $\langle x, y\rangle = \sum_{i=1}^{N} x_i y_i$, where the
$N$ elements of $x$ and $y$ are denoted by $x_i$ and $y_i$ and $x = (x_1, \ldots, x_N)^T$.
The norm $\|x\|$ of $x$ is defined by $\|x\|^2 = \langle x, x\rangle$. Consider four classes
of objects with training set images

$\{I_i\}_{i=1}^{n}$, $\{J_i\}_{i=1}^{m}$, $\{K_i\}$ and $\{L_i\}$,

where there are $n$ images in $\{I_i\}$ (class 1), etc.

An intra-class SDF (equal correlation peak SDF) $F$ is defined such that

$\langle I_i, F\rangle = 1$ for $i = 1, \ldots, n$. (1)

A two-class mutual orthogonal function (MOF) SDF is defined such that

$\langle I_i, F\rangle = 0$ for $i = 1, \ldots, n$ (2)

$\langle J_i, F\rangle = 1$ for $i = 1, \ldots, m$, (3)

with other projection values possible in (3). Two MOF SDFs $F_1$ and $F_2$, with $F_1$
recognizing all versions of class 1 ($I_i$) and rejecting all versions of class 2 ($J_i$),
and vice versa for $F_2$, are defined such that

$\langle I_i, F_1\rangle = 1$ and $\langle I_i, F_2\rangle = 0$ for $i = 1, \ldots, n$

$\langle J_i, F_1\rangle = 0$ and $\langle J_i, F_2\rangle = 1$ for $i = 1, \ldots, m$. (4)


A multi-level SDF for 4 classes results by choosing different vector-inner product forcing
constants, i.e. for the 4 classes in (1), we can require

$\langle F, I_i\rangle = 0$, $\langle F, J_i\rangle = 1$, $\langle F, K_i\rangle = 2$, $\langle F, L_i\rangle = 3$. (5)

A K-tuple SDF for $M$ classes uses $K$ SDFs (where $2^K \geq M$) with binary valued
projection forcing constants. For the 4 classes in (1), we require $K = 2$ SDFs ($F_1$ and
$F_2$) defined such that

$\langle F_1, I_i\rangle = 0$, $\langle F_2, I_i\rangle = 0$

$\langle F_1, J_i\rangle = 0$, $\langle F_2, J_i\rangle = 1$

$\langle F_1, K_i\rangle = 1$, $\langle F_2, K_i\rangle = 0$

$\langle F_1, L_i\rangle = 1$, $\langle F_2, L_i\rangle = 1$. (6)

To obtain unique solutions $F$ for these SDFs, we require them to lie in the subspace
spanned by the training sets. This solution has the additional advantage that the
projections of other objects not considered (for classification) will be minimized. Thus,
$F_1$ etc. are linear combinations of the 4 training sets, e.g.
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Using  (7a) ,  we  write  the  vector-inner  product  of  F^  and  Ij  as 
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(8) 


Similar expressions can be obtained for the other vector inner products in (2)-(6). The
$a_i$ and $b_i$ coefficients that define $F_1$ and $F_2$, for the SDF in (6), can be obtained
by solving the following system of equations
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In  (9),  superscript  t  denotes  transpose  and  (  )t(  )  denotes  a  vector-inner  product.  If  a 
unique solution does not exist, the least-squares solution (obtained by computing the
generalized inverse of the matrix) is used. As noted at the outset, these SDFs are computed
using fixed projection values for the various training set classes.
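
The synthesis described in this section can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it assumes NumPy, uses random vectors as stand-ins for the four training sets, and the names `synthesize_sdf`, `train` and `codes` are hypothetical. Each filter is constrained to lie in the span of the training images, so its coefficients follow from the matrix of vector-inner products; `numpy.linalg.pinv` supplies the generalized inverse used when no unique solution exists.

```python
import numpy as np

def synthesize_sdf(train, targets):
    """Return F in the span of the training images with <F, x_i> = targets[i].

    train   : (n_images, n_pixels) array, one flattened image per row
    targets : (n_images,) desired projection (forcing) values
    """
    X = np.asarray(train, dtype=float)
    R = X @ X.T                        # matrix of vector-inner products
    a = np.linalg.pinv(R) @ targets    # generalized inverse: least squares if R is singular
    return X.T @ a                     # F = sum_i a_i x_i

rng = np.random.default_rng(0)
n_per_class, n_pixels = 3, 64
# Random stand-ins for the training sets {I}, {J}, {K}, {L} (assumption, not real imagery)
train = np.vstack([rng.random((n_per_class, n_pixels)) for _ in range(4)])

# Binary forcing constants of (6): I -> (0,0), J -> (0,1), K -> (1,0), L -> (1,1)
codes = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
u1 = np.repeat(codes[:, 0], n_per_class)
u2 = np.repeat(codes[:, 1], n_per_class)
F1 = synthesize_sdf(train, u1)
F2 = synthesize_sdf(train, u2)

# Thresholding the two projections recovers the 2-bit class code of each training image
proj = np.column_stack([train @ F1, train @ F2])
decoded = (proj > 0.5).astype(int)
```

The same function covers the other SDFs in this section by changing `targets`, for example all ones for the intra-class SDF in (1) or the values 0, 1, 2, 3 for the multi-level SDF in (5).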

3. OLDFs

We can describe the new OLDFs as linear functionals $f$ on the finite dimensional vector
space of images. From the Riesz representation theorem [12], we can also describe these
OLDFs by a discriminant vector $u$, where $f(x) = \langle u, x\rangle$ for all $x$ in our
real linear vector space. The vector-inner product of two functionals $f_1$ and $f_2$ in the
dual vector space and their corresponding $u_1$ and $u_2$ OLDFs are related simply by
$\langle f_1, f_2\rangle = \langle u_1, u_2\rangle$. For our OLDFs, we consider only
two-class problems $\{I_i\}$ and $\{J_i\}$ with $n$ and $m$ training set images respectively.

3.1  OLDF-1 

As our first OLDF (OLDF-1), we consider a version of the MOF SDF in which the projection for
one input class is 0 whereas the projection for the other class is maximized (rather than
being a fixed constant value of 1). Three types of maximizations were considered
(corresponding to 3 cases (A, B, C) for OLDF-1). These are defined as finding the OLDF-1 $u$,
such that

CASE A: $\langle u, J_i\rangle = 0$ for $i = 1, \ldots, m$ (10a)

$|\langle u, I_i\rangle| = \max_{\text{all } x} |\langle x, I_i\rangle|$ for $i = 1, \ldots, n$. (10b)

CASE B: $\langle u, J_i\rangle = 0$ for $i = 1, \ldots, m$ (11a)

$\sum_{i=1}^{n} |\langle u, I_i\rangle| = \max_{\text{all } x} \sum_{i=1}^{n} |\langle x, I_i\rangle|$. (11b)

CASE C: $\langle u, J_i\rangle = 0$ for $i = 1, \ldots, m$ (12a)

$\sum_{i=1}^{n} \langle u, I_i\rangle^2 = \max_{\text{all } x} \sum_{i=1}^{n} \langle x, I_i\rangle^2$. (12b)

In all cases, $\|u\| = 1$ and $\|x\| = 1$ (i.e. we describe the formulation for normalized
image and discriminant vectors). This is necessary to ensure that physically large objects do
not dominate the filter. Of course, all testing is performed on unnormalized images.


Let us discuss the 3 cases in (10)-(12). In (10), we force the projections for one class
$\{J_i\}$ to 0 and maximize the absolute value of each of the projection values of all vector
images in the other class $\{I_i\}$. There is no general solution to case A [13]. In case B,
we maximize the sum of the absolute values of the projections for the first class of images
$\{I_i\}$. In case C, we maximize the sum of the squares of the projections on $\{I_i\}$.
Case C is an analytically simpler optimization problem. Thus, we form OLDF-1 using (12). In
(12), $u$ is the $x$ for which $\sum_i \langle x, I_i\rangle^2$ is a maximum.


To solve (12), we first denote the subspace spanned by the $\{J_i\}$ as
$\mathrm{Sp}\{J_i\}_{i=1}^{m}$ (where $m$ is the number of vectors) and the subspace for
$\{I_i\}$ by $\mathrm{Sp}\{I_i\}_{i=1}^{n}$. In total, there are $N = m+n$ training set
vectors and a maximum of $N$ basis functions $\{\zeta_i\}$ for this data. We proceed to form
a maximal orthonormal set $\{\zeta_i\}_{i=1}^{m'}$ from $\{J_i\}$, where $m' \leq m$. Next,
we look at the remaining orthogonal elements in our space. We form a set of $n'$ of these,
$\{\zeta_j\}_{j=m'+1}^{m'+n'}$ (where $n' \leq n$), that spans $\mathrm{Sp}\{I_i\}$ and is
orthogonal to $\mathrm{Sp}\{\zeta_i\}_{i=1}^{m'}$. Our OLDF-1 $u$ is now an element of
$\mathrm{Sp}\{\zeta_j\}_{j=m'+1}^{m'+n'}$. We thus define

$I'_i = \sum_{j=m'+1}^{m'+n'} \langle I_i, \zeta_j\rangle \zeta_j$ (13)

as a weighted sum (with weights given by the vector-inner product
$\langle I_i, \zeta_j\rangle$) of the $\zeta_j$ (which are orthogonal to the $J_i$). The
optimization problem in (12a) and (12b) thus becomes: find $u$, such that

$\sum_{i=1}^{n} \langle u, I'_i\rangle^2 = \max_{\text{all } x} \sum_{i=1}^{n} \langle x, I'_i\rangle^2$. (14)

We rewrite (14) as
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The  solution  u  to  (15)  that  defines  OLDF-1  is 


$u$ = Dominant eigenvector of $R_{I'}$, (16)

where

$R_{I'} = \sum_{i=1}^{n} I'_i I'^{T}_i$ = Correlation matrix for $\{I'_i\}$. (17)

We note that for $n' = 1$, (16) solves (10) and (11) also.
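
The OLDF-1 (case C) procedure above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: it uses NumPy, random vectors stand in for the two classes, the function name `oldf1` is hypothetical, and an SVD is substituted for the orthonormalization step that builds the basis of the class-2 subspace.

```python
import numpy as np

def oldf1(I, J, tol=1e-10):
    """Sketch of OLDF-1 (case C). Rows of I and J are flattened class-1 / class-2 images."""
    I = I / np.linalg.norm(I, axis=1, keepdims=True)   # normalized image vectors
    J = J / np.linalg.norm(J, axis=1, keepdims=True)
    # Orthonormal basis for Sp{J}; the SVD stands in for the orthonormalization step
    U, s, _ = np.linalg.svd(J.T, full_matrices=False)
    Q = U[:, s > tol * s[0]]
    I_perp = I - (I @ Q) @ Q.T      # I'_i: component of I_i orthogonal to Sp{J}
    R = I_perp.T @ I_perp           # sum_i I'_i I'_i^T, as in (17)
    _, V = np.linalg.eigh(R)        # symmetric matrix, eigenvalues in ascending order
    return V[:, -1]                 # dominant eigenvector (unit norm), as in (16)

rng = np.random.default_rng(1)
I = rng.random((6, 40))             # stand-in class-1 training vectors (assumption)
J = rng.random((6, 40))             # stand-in class-2 training vectors (assumption)
u = oldf1(I, J)
```

For real imagery the pixel-space matrix `R` can be large; an equivalent eigenvector can be obtained from the much smaller matrix `I_perp @ I_perp.T`, which is a design choice left out of this sketch for brevity.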

3.2  OLDF-2 

For OLDF-2, for each $I_i$, we find the $J_i$ image in $\{J_i\}$ that is closest (using the
norm distance) to $I_i$. In OLDF-2, we maximize the sum (over all $i = 1, \ldots, n$) of the
squares of $\langle u, I_i - J_i\rangle$, i.e. OLDF-2 is $u$ such that

$\sum_{i=1}^{n} \langle u, I_i - J_i\rangle^2 = \max_{\text{all } x} \sum_{i=1}^{n} \langle x, I_i - J_i\rangle^2$. (18)

Following the procedure in Section 3.1, the OLDF-2 solution $u$ is

$u$ = Dominant eigenvector of $R_{I_i - J_i}$, (19)

where

$R_{I_i - J_i} = \sum_{i=1}^{n} (I_i - J_i)(I_i - J_i)^T$ (20)

is the correlation matrix of the $(I_i - J_i)$ vectors, where $J_i$ is the vector image in
$\{J_i\}$ that is closest to $I_i$.
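
A short numerical sketch of OLDF-2 may help make the nearest-neighbor pairing concrete. It assumes NumPy and random stand-in vectors; the function name `oldf2` and the use of the Euclidean norm for the "norm distance" are assumptions made for illustration.

```python
import numpy as np

def oldf2(I, J):
    """Sketch of OLDF-2. Rows of I and J are image vectors of the two classes."""
    # Euclidean distance from every I_i to every J_j
    dist = np.linalg.norm(I[:, None, :] - J[None, :, :], axis=2)
    nearest = J[dist.argmin(axis=1)]   # closest class-2 image for each I_i
    diff = I - nearest                 # rows are the I_i - J_i vectors
    R = diff.T @ diff                  # sum_i (I_i - J_i)(I_i - J_i)^T, as in (20)
    _, V = np.linalg.eigh(R)           # eigenvalues in ascending order
    return V[:, -1], diff              # dominant eigenvector and the difference vectors

rng = np.random.default_rng(3)
I = rng.random((6, 40))                # stand-in class-1 vectors (assumption)
J = rng.random((8, 40))                # stand-in class-2 vectors (assumption)
u, diff = oldf2(I, J)
objective = np.sum((diff @ u) ** 2)    # value of the sum maximized in (18)
```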

3.3  OLDF-3 

In OLDF-2, we maximized the difference between $I_i$ and $J_i$, the nearest neighbor of
$I_i$. In OLDF-3, we maximize the difference between each $I_i$ and all $J_j$, i.e. the
overall total separation between both classes. This OLDF-3 filter $u$ is defined by

$\sum_{i=1}^{n} \sum_{j=1}^{m} \langle u, I_i - J_j\rangle^2 = \max_{\text{all } x} \sum_{i=1}^{n} \sum_{j=1}^{m} \langle x, I_i - J_j\rangle^2$. (21)

Following the procedure in Section 3.1, the solution $u$ to OLDF-3 is

$u$ = Dominant eigenvector of $R$ (22)
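
As an illustration of OLDF-3, the sketch below assumes, by analogy with (20), that $R$ is the sum of outer products of all pairwise differences $I_i - J_j$; that definition is an assumption here. It uses NumPy with random stand-in vectors and hypothetical function names, and it also checks that the same matrix can be assembled from class sums without looping over all $nm$ pairs.

```python
import numpy as np

def oldf3_matrix_bruteforce(I, J):
    # Accumulate the outer product of every pairwise difference I_i - J_j
    N = I.shape[1]
    R = np.zeros((N, N))
    for a in I:
        for b in J:
            d = a - b
            R += np.outer(d, d)
    return R

def oldf3_matrix_fast(I, J):
    # Same matrix from class sums:
    # m * sum_i I_i I_i^T + n * sum_j J_j J_j^T - sI sJ^T - sJ sI^T
    n, m = len(I), len(J)
    sI, sJ = I.sum(axis=0), J.sum(axis=0)
    return m * (I.T @ I) + n * (J.T @ J) - np.outer(sI, sJ) - np.outer(sJ, sI)

rng = np.random.default_rng(4)
I = rng.random((6, 40))               # stand-in class-1 vectors (assumption)
J = rng.random((8, 40))               # stand-in class-2 vectors (assumption)
R_slow = oldf3_matrix_bruteforce(I, J)
R_fast = oldf3_matrix_fast(I, J)
_, V = np.linalg.eigh(R_fast)
u = V[:, -1]                          # dominant eigenvector, as in (22)
```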
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3.4  EXTENSIONS 


In Section 3, we considered the OLDF formulation for two-class problems. However, extensions
to multi-class recognition can be achieved by extending our OLDFs using the techniques in
Section 2. We have described our OLDF solutions as the most dominant eigenvectors. However,
one can retain the N most dominant eigenvectors. The number to be retained depends upon the
eigenvalues. If the most dominant eigenvalues are close, then we can retain more than one
eigenvector. If the two largest eigenvalues are widely separated, keeping the second most
dominant eigenvector will not necessarily improve performance, since additional noise is now
present in the filter. When N filters are used, the sum of the absolute values of the
projections on each is compared to a threshold (set from training set data for each class).
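
The multi-eigenvector extension can be sketched as below. This is only an illustration under assumptions: NumPy is used, the matrix is built from random stand-in difference vectors, the helper names are hypothetical, and the eigenvalue ratio is shown merely as one possible way to judge whether the leading eigenvalues are "close".

```python
import numpy as np

def dominant_eigvecs(R, k):
    """Return the k largest eigenvalues (descending) and their eigenvectors as columns."""
    w, V = np.linalg.eigh(R)
    order = np.argsort(w)[::-1][:k]
    return w[order], V[:, order]

def multi_filter_score(x, U):
    """Sum of the absolute values of the projections of x on each retained filter."""
    return np.abs(U.T @ x).sum()

rng = np.random.default_rng(5)
diff = rng.random((6, 40)) - rng.random((6, 40))   # stand-in difference vectors (assumption)
R = diff.T @ diff
w, U = dominant_eigvecs(R, k=2)
ratio = w[1] / w[0]      # near 1: eigenvalues are close, so a second filter may be worth keeping
score = multi_filter_score(diff[0], U)             # value to compare against a threshold
```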

4.  INITIAL  TEST  RESULTS 

4.1 DATABASE

To test the performance of our OLDFs, we used two classes (two ships: the Moskva, a
Soviet helicopter cruiser, and the Leahy, a U.S. guided-missile cruiser). Each ship was
binarized with 128 x 32 pixels. For each ship, 36 views at a 90° depression angle (0°
attack angle) were available (every 10° around the ship). The bow is numbered as image 1,
broadside as 9 and the stern as 18, etc. For each object class, various sets of 6 images
were used for the training set. The OLDF was then tested against all 72 images in the two
classes (including the 60 images in both classes that the system had never seen). In
Figure 1, there are about 2000 pixels on the broadside ship views and 200 pixels on the bow
and stern views. In our tests, we also included noise (in both the training and test set)
with $\sigma_n$ listed (SNR is different for each ship aspect view due to the different
number of pixels on each aspect view). In Figure 2, we show the views of ship class 1 with
$\sigma_n = 0.3$ and $\sigma_n = 0.4$ of noise added.


FIGURE 1. Broadside views of the two ships: (a) Class 1 (Moskva), (b) Class 2 (Leahy).

FIGURE 2. Broadside view of ship class 1 with different $\sigma_n$ of noise added: (a) $\sigma_n = 0.3$ noise, (b) $\sigma_n = 0.4$ noise.

4.2  INITIAL  TEST  RESULTS 

Table 1 shows the noise-free performance obtained using the three OLDFs with 6 training
set images per class. In general, excellent results are obtained, with no more than 4
misclassifications out of 72 images. In Test 1, the hull was present. Classification is
better without the hull present (Tests 2-4), since the ship's superstructure gives good
discrimination and the hull is in general common data. With the hull present, two linear
functionals had to be used to maintain good performance (thus verifying the above remarks).
Test 3 performs worse than Test 2, since maximum separation from one nearest image in the
second class is not enough. Test 4 performs better than Test 3, as expected, since the
differences from all images in the two classes were maximized.

The performance of all of the various OLDFs in the presence of noise in the training set,
in the testing set and in both was quantified. The results for OLDF-1 are shown in Table 2.
The results for the other OLDFs are similar. The standard deviation $\sigma_n$ of the noise
is also listed. As seen, performance is excellent in the presence of noise and generally
decreases as $\sigma_n$ increases, as expected. Given the amount of deterioration present in
the images of Figure 2 with $\sigma_n = 0.3$ and 0.4 noise levels, the initial performance of
these OLDFs is quite attractive.


TABLE  1 

Test results using various OLDFs (6 training set images/class, binary images)
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TABLE  2 

Noise  performance  of  OLDFs 

(6  training  images/class,  binary  images,  one  functional) 
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5 .  SUMMARY 


New distortion-invariant correlator filters have been described that maximize various
discriminant pattern recognition measures. The theoretical basis and ease of analysis of
these new OLDFs are attractive. Initial experimental results are excellent and noise
performance is robust. Full correlation tests and further experiments are needed to assess
OLDFs more fully.
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ABSTRACT 


The efficient representation and synthesis of 3-D object information using new synthetic
discriminant functions (SDFs) is discussed. The use of SDFs in a correlator for
shift-invariant and distortion-invariant discrimination of 3-D objects is detailed and
experimental data are provided. The new SDFs described control the peak intensity and the
structure and statistics of the correlation plane pattern.

1. INTRODUCTION


Correlators represent one of the most powerful techniques for automatic target recognition
(ATR). These systems allow multiple objects to be recognized in parallel (by the
shift-invariant property of a correlator) in the presence of noise and structured clutter
(due to the processing gain achieved by a correlator). The realization of correlators using
coherent optical systems is obvious [1,2], and real-time coherent optical correlators of
small size and weight now exist [3]. Advanced VHSIC chips and architectures may also allow
on-line correlations to be implemented digitally. The major shortcoming of correlators has
been their poor performance in the face of geometrical distortions between the input image
and the reference object from which a matched spatial filter (MSF) is formed [4]. Recently,
advanced MSF synthesis techniques have been detailed [5] and demonstrated on ship imagery
[6]. These new MSF synthesis algorithms form the MSF from a training set of images of
different target objects from different aspects, scales, rotations, etc. These new filter
functions are referred to as synthetic discriminant functions (SDFs).

A brief review of conventional SDFs is provided in Section 2 with discussion on their use
in the representation of 3-D object information. Three new types of SDFs that control the
shape of a correlation plane pattern are then detailed in Section 3 with discussion on their
representation of 3-D object data. Initial experimental results are then advanced in
Section 4 using a new database of tank, armored personnel carrier (APC) and similar military
ATR objects.


2.  THE  SDF-BASED  CORRELATOR  CONCEPT 


To achieve intra-class recognition of different distorted versions of a 3-D input ATR
object using a correlator, the MSF $h(x,y)$ can be formed from a linear combination of
training set images $\{f_n\}$ that are different 3-D distorted views of the target object,
i.e.

$h(x,y) = \sum_n a_n f_n(x,y)$. (1)

If we restrict the correlation peak value to be unity for all $f_n$, then the SDF MSF in (1)
is defined by

$a = R^{-1} u$, (2)

where the elements of the vector $a$ define the linear combination coefficients $a_n$, $u$ is
the unit vector (all of its elements are one, which forces all correlation peak values to be
unity), and $R$ is the correlation matrix of the $\{f_n\}$ training set imagery. The SDF in
(1)-(2) achieves intra-class recognition. To obtain inter-class discrimination while still
retaining intra-class recognition, the training set is expanded to include sets of the
distorted objects $\{f_1\}$ and $\{f_2\}$ in two or more classes. A single SDF or several
SDFs that are linear combinations of all of the training set imagery can then be formed. The
filter synthesis procedure is similar to that in (1) and (2) with larger $R$ matrices (for
several object classes) and different exogenous vectors $u$ as detailed elsewhere [5]. The
object class can be determined from the value of the correlation peak or from combinations
of different filter output values.
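
The relation between (1), (2) and the correlation peak can be checked numerically. The sketch below is illustrative only: it assumes NumPy, random 16 x 16 arrays standing in for the distorted views, a circular (FFT-based) correlation in place of the optical correlator, and hypothetical function names.

```python
import numpy as np

def projection_sdf(images):
    """images: (n, H, W). Returns h = sum_n a_n f_n with a = R^{-1} u and u all ones."""
    n = images.shape[0]
    X = images.reshape(n, -1)
    R = X @ X.T                                # correlation matrix of the training set
    a = np.linalg.solve(R, np.ones(n))         # a = R^{-1} u, as in (2)
    return np.tensordot(a, images, axes=1)     # linear combination of (1)

def correlate(f, h):
    """Circular cross-correlation via FFTs; index [0, 0] is the zero-shift value."""
    return np.real(np.fft.ifft2(np.fft.fft2(f) * np.conj(np.fft.fft2(h))))

rng = np.random.default_rng(6)
train = rng.random((4, 16, 16))                # stand-ins for the views f_n (assumption)
h = projection_sdf(train)
peaks = np.array([correlate(f, h)[0, 0] for f in train])   # central correlation values
```

Only the zero-shift value of each correlation plane is constrained here, which is why these filters are later called projection SDFs.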

These initial SDFs [5] have performed well in tests on various image databases [6],
primarily on ship imagery. In this paper, we consider other ATR targets (tanks and APCs) and
we extend the original SDF concept to include control of the shape of the correlation plane
pattern. The original SDFs only control the value at one point in the correlation plane, and
thus we refer to these as projection SDFs. For ATR using the original SDFs, a correlation
plane threshold is set (determined by the filter synthesis algorithm) and, from the locations
and peak values of the regions of one or several correlation planes that exceed the
threshold, the object class and object location in the input field-of-view can be
determined. This technique is susceptible to variations in the modulation level of the input
data (since the correlation value varies linearly with the modulation of the input object).
From the dc value of the input Fourier transform (FT) plane pattern, the output threshold can
be adjusted. In Section 2, we detail three new SDFs (correlation SDFs) that automatically
control the shape of the true and false correlation plane locations and thus facilitate
correlation plane analysis by the combination of threshold detection and correlation plane
and peak analysis.


FIGURE 1. Block diagram of an SDF-based correlator. Outputs: object class(es), object location(s), estimate(s) confidence.

The full correlation system (Figure 1) thus distributes the processing and recognition
load between the filter synthesis, the correlator and the output plane detector. From the
dc value of the FT of the input, an estimate of the input modulation is obtained and used
to adjust the input intensity and the correlation plane threshold. The system's outputs
provide estimates of the object class and location of all objects in the input field-of-view
and the confidence of these estimates.


.  CORRELATION  SDF  SYNTHESIS  FOR  CORRELATION  SHAPE  CONTROL 


To control the shape of the correlation peak for a true target and to ensure suppression
of large correlation plane peaks for shifted versions of a false target, we expand the
training set of images to include $N_s$ shifted versions of each object. To describe the
filter synthesis, we consider a two-class pattern recognition problem with $N_1$ and $N_2$
training set images $\{f\}$ and $\{g\}$ per class with $N_s$ shifted versions of each
training set image, i.e. a total of $N_T = N_s(N_1 + N_2)$ training set images. The SDF
synthesis algorithm in (2) restricts only the vector inner product or the vector projection
of each object $f_i$ or $g_i$ onto the filter function $h$, i.e. only the central
correlation plane value.

2.1 SDF-1 (Exact Correlation SDF)


For the one-filter two-class pattern recognition problem, the new SDF is defined to
satisfy

$h \cdot f_i = 1$ (central correlation value of 1 for true targets) (3a)

$h \cdot f'_i = 0$ (0 correlation value away from peak) (3b)

$h \cdot g'_i = 0$ (0 correlation value away from peak) (3c)

$h \cdot g_i = 0$ (central correlation value of 0 for false targets) (3d)

where the notation is defined in Table 1. Eqs. (3a) and (3d) are similar to the original SDF
requirements (a central correlation peak value of unity for true targets in class one and
zero projections for false class two targets). Eqs. (3b) and (3c) are the new restrictions.


TABLE 1

$f_i$ = object in the class to be recognized
$f'_i$ = shifted version of $f_i$
$g_i$ = object in second class to be rejected
$g'_i$ = shifted version of $g_i$
$d_s$ = amount of shift (in pixels) for $(\ )'$
$N_1$ = number of training set images $\{f\}$ in class 1
$N_2$ = number of training set images $\{g\}$ in class 2
$N_s - 1$ = number of shifted versions of each image
$N_T = N_s(N_1 + N_2)$ = total training set size
$h$ = MSF SDF filter function


The filter $h(x,y)$ satisfying (3) will thus have a correlation plane output for a true
class-one target with a fixed peak value of unity and a fixed zero value $d_s$ pixels away
(in $\pm x$ and $\pm y$), and thus a well-defined correlation plane peak shape. The
controlled correlation peak value allows the use of a fixed correlation plane threshold
$T_c$ of 0.5. This selects regions of potential interest in the field-of-view. For each
output plane region of interest exceeding the threshold $T_c$, the ratio C = peak/mean is
computed. This new classification measure C, applied to those correlation plane regions
exceeding $T_c$, allows significantly better system performance. To calculate C, the mean is
computed over the $(2d_s+1) \times (2d_s+1)$ pixel region around each peak of interest (the
mean computation excludes the central peak value). In our specific work, the parameters in
Table 2 are used. Many other variations of this basic algorithm are possible, such as:

(1) applying the C threshold to the regions of the output with the largest peak values
only or just the largest peak location (if only one object is known to be present);

(2) extension of the filter in Eq. (3) to the five different projection SDFs [5];

(3) extension of the requirements in Eq. (3) to include more shifted versions of each
input image;

(4) application of a weighted spatial taper to the FT of the SDF to suppress its sidelobe
response;

(5) modification of the mean in C to include only those correlation plane values at the
specific $d_s$ pixel shifts from the peak value; and

(6) use of input modulation estimates to adjust the $T_c$ threshold.
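
The threshold-then-ratio test described before this list can be sketched as follows. This is a minimal illustration assuming NumPy; the function names are hypothetical, every pixel above $T_c$ is treated as a candidate, and clipping the averaging window at the plane borders is an assumption not stated in the text.

```python
import numpy as np

def peak_to_mean(plane, row, col, ds):
    """C = peak / mean over the (2*ds+1) x (2*ds+1) region around (row, col),
    with the central peak value excluded from the mean."""
    r0, r1 = max(row - ds, 0), min(row + ds + 1, plane.shape[0])
    c0, c1 = max(col - ds, 0), min(col + ds + 1, plane.shape[1])
    window = plane[r0:r1, c0:c1]
    peak = plane[row, col]
    mean = (window.sum() - peak) / (window.size - 1)
    return peak / mean

def classify_regions(plane, Tc=0.5, ds=5):
    """Return (row, col, C) for every correlation plane pixel exceeding the threshold Tc."""
    return [(r, c, peak_to_mean(plane, r, c, ds)) for r, c in np.argwhere(plane > Tc)]

# Synthetic correlation plane: flat background of 0.1 with a single unit peak
plane = np.full((32, 32), 0.1)
plane[10, 10] = 1.0
hits = classify_regions(plane)
```

With this synthetic plane the single candidate at (10, 10) has a background mean of 0.1 and therefore a ratio C of about 10.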

TABLE 2

Specific filter parameters used

$N_1 = N_2 = 6$
$N_s = 5$ (centered and 4 shifted)
$d_s = 5$ pixels
$N_T = 5(6+6) = 60$

To synthesize the SDF $h$ satisfying (3), we restrict $h(x,y)$ to be a linear combination of
all training set images

$h(x,y) = \sum_{N_1} a_{i1} f_i(x,y) + \sum_{N_2} a_{i2} g_i(x,y) + \sum_{(N_s-1)N_1} a_{i3} f'_i(x,y) + \sum_{(N_s-1)N_2} a_{i4} g'_i(x,y)$, (4)

where the number of images in each summation is indicated under the associated $\sum$.
Denoting the full $N_T$ set of training set images by $\{z\}$ and individual images by
$z_n$, the filter $h$ in (4) is defined by the coefficient vector $a$ that solves the
equation

$R a = u_1$, (5)

where $R$ is the full $N_T \times N_T$ correlation matrix of all $\{z\}$ training set images.
The choice of $u_1$ in (5) satisfies the requirements in (3). Eq. (5) is a simple extension
of (2). The SDF in (3) is referred to as a correlation SDF and the specific SDF solution in
(5) is denoted as SDF-1 or the exact correlation SDF. This terminology refers to the fact
that the solution in (5) is an exact solution to (3).
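
A small numerical sketch of the exact correlation SDF follows. It is not the authors' implementation: it assumes NumPy, random 16 x 16 stand-in images, circular shifts via `numpy.roll` to generate the shifted versions, and hypothetical function names. The exogenous vector $u_1$ holds 1 for each unshifted true-class image and 0 for every other training image, as required by (3).

```python
import numpy as np

def shifted_versions(img, ds):
    """The four versions of img shifted by ds pixels in +-x and +-y (circular shifts)."""
    return [np.roll(img, s, axis=ax) for ax in (0, 1) for s in (ds, -ds)]

def exact_correlation_sdf(f_imgs, g_imgs, ds):
    train, u = [], []
    for img in f_imgs:                     # true class: peak of 1, zero at the shifts
        train += [img] + shifted_versions(img, ds)
        u += [1.0] + [0.0] * 4
    for img in g_imgs:                     # false class: zero at the center and the shifts
        train += [img] + shifted_versions(img, ds)
        u += [0.0] * 5
    Z = np.array(train).reshape(len(train), -1)
    R = Z @ Z.T                            # N_T x N_T correlation matrix
    a = np.linalg.solve(R, np.array(u))    # R a = u_1, as in (5)
    return (Z.T @ a).reshape(f_imgs[0].shape)

rng = np.random.default_rng(7)
f_imgs = list(rng.random((2, 16, 16)))     # stand-in true-class images (assumption)
g_imgs = list(rng.random((2, 16, 16)))     # stand-in false-class images (assumption)
ds = 3
h = exact_correlation_sdf(f_imgs, g_imgs, ds)
```

Here $N_1 = N_2 = 2$ and $N_s = 5$, so $N_T = 20$; the parameters of Table 2 would give the 60 x 60 system discussed in the text.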
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2.2  SDF-2  (Least-Square  Correlation  SDF) 

The solution $a$ in (5) requires solving the $N_T$ linear algebraic equations (LAEs) defined
by (5). As the size of the problem increases, so will the dimensionality $N_T$ of $R$, and
computational problems plus ill-conditioned matrices may arise (even though the filter
synthesis is performed off-line). The typical solution to such a problem is to reduce the
dimensionality of $R$, i.e. to reduce the number of training set images. However, a reduction
in $N_1$ or $N_2$ will degrade the 3-D object information on each target class and a
reduction in $N_s$ will degrade the correlation shape. The SDF synthesis in (4) and (5) is
equivalent to describing each object as a $d$-dimensional vector in hyperspace where
$d = N_T$. In these terms, our second realization of SDF-1 retains all $N_T$ training set
images but reduces $d$ to $D$, such that $D < N_T$. This reduction of the dimensionality of
our hyperspace rather than the number of training set images is practical, preferable and
new. To select the $D$ basis functions $\phi_d(x,y)$, we use the well-known Karhunen-Loeve
(KL) technique [7]. For each unshifted training set object, and for each shifted version of
each training set object, we compute the dominant KL eigenvectors of the associated
correlation matrix. In our experiments, we retain three dominant KL eigenvectors per class
(as noted above). Efficient methods of computing the dominant KL eigenvectors of a large
matrix and a large database were noted earlier [8].

For the case of $(N_s - 1) = 4$ shifted versions of each image (five images per object,
including the centered image), the three dominant KL eigenvectors of each of the ten
correlation matrices $R$ (the matrices for the original object $f$, the false target $g$ and
each of the four sets of shifted images per class, with two shifts in both $x$ and $y$) were
computed and retained. This provides a new $D = 30$ (rather than $d = 5 \times 12 = 60$)
basis function set. This new $\{\phi'\}$ basis function set thus represents all of the 3-D
information in the training set imagery for the two targets. Retaining more than three KL
eigenvectors per class improves the accuracy of this approximate algorithm (at the expense
of increased off-line computational complexity). As noted in [11], three eigenvectors are
generally adequate to represent over 90% of the 3-D object information. This basis function
set was then converted to the orthonormal basis function set $\{\phi\}$ using a Gram-Schmidt
(GS) orthogonalization technique [12].
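
The basis construction described above can be sketched as follows. The sketch is illustrative and rests on assumptions: NumPy is used, ten groups of six random 12 x 12 arrays stand in for the unshifted and shifted image sets, the dominant eigenvectors of each group's pixel-space correlation matrix are obtained from an SVD, and a QR factorization replaces an explicit Gram-Schmidt loop (both yield an orthonormal basis of the same span). The function names are hypothetical.

```python
import numpy as np

def dominant_kl_vectors(group, k=3):
    """k dominant eigenvectors of the correlation matrix X^T X of one image group,
    obtained from the SVD of X (rows of X are flattened images)."""
    X = group.reshape(group.shape[0], -1)
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    energy = (s[:k] ** 2).sum() / (s ** 2).sum()   # fraction of the eigenvalue sum retained
    return Vt[:k].T, energy

def orthonormal_basis(groups, k=3):
    vecs = [dominant_kl_vectors(g, k)[0] for g in groups]
    P = np.hstack(vecs)              # k pixel-space vectors per group
    Q, _ = np.linalg.qr(P)           # orthonormal columns spanning the same space
    return Q

rng = np.random.default_rng(8)
groups = [rng.random((6, 12, 12)) for _ in range(10)]   # stand-in image groups (assumption)
Phi = orthonormal_basis(groups)                          # columns play the role of {phi_d}
energies = [dominant_kl_vectors(g)[1] for g in groups]
```

With ten groups and three vectors per group, `Phi` has $D = 30$ orthonormal columns, matching the dimensionality quoted in the text.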

In terms of these new basis functions, we describe the desired filter function as the
linear combination filter

$h(x,y) = \sum_{d=1}^{D} b_d \phi_d(x,y)$. (6)

Each input image in $\{z\}$ can then be described as a linear combination of the basis
functions as

$z_n(x,y) = \sum_{d=1}^{D} c_{nd} \phi_d(x,y)$, (7)

where $n$ varies from 1 to $N_T$ (i.e. over the full training set of $N_T$ images) and
$c_{nd}$ is of size $N_T \times D$. Hereafter all $z_n(x,y)$ images are represented as
vectors $z_{nd}$ of length $D$ (with their $D$ elements equal to the projections of
$z_n(x,y)$ on the $D$ vectors $\phi_d(x,y)$). According to (3), we require

$z_{nd}^T b = 1$ if $z_{nd} = f_i$

$z_{nd}^T b = 0$ if $z_{nd} \neq f_i$. (8)

In matrix-vector form, we write (6) and (8) as

$C b = u_1$, (9)

where $C$ contains $N_T$ rows and $D$ columns. Since $N_T > D$, the classic least-square
solution is used to determine $h(x,y)$ as

$b = [C^T C]^{-1} C^T u_1$, (10)

where the size of $C^T C$ is now $D \times D$. Other optical solutions to such an
overdetermined least squares problem [9,10] were considered for the case $D > N_T$. These
solutions have not been correctly formulated. Specifically, if $D > N_T$, no unique solution
exists, since the $[C^T C]$ matrix is not of full rank and thus cannot be inverted. Thus, a
least squares solution is not appropriate.

For our case, $N_T > D$ and a solution exists. We recall that

$C^T C = \sum_{n=1}^{N_T} z_{nd} z_{nd}^T = R$, (11)

where $z_{nd}$ is of length $D$, the matrix multiplication in (11) is formed as a
vector-outer-product sum, and $R$ in (11) is the $D \times D$ correlation matrix of reduced
dimension $D$. We also note that

$C^T u_1 = \sum_{n=1}^{N_T} u_n z_{nd}$, (12)

where the matrix-vector product in (12) has been written as a sum over the vectors $z_{nd}$
weighted by the elements $u_n$ of $u_1$. Using (11) and (12), Eq. (9) becomes

$R b = \sum_{n=1}^{N_T} u_n z_{nd}$. (13)

The solution for the SDF in terms of the $z_{nd}$ of length $D$ and the $D \times D$ matrix
$R$ is

$b = R^{-1} \sum_{n=1}^{N_T} u_n z_{nd}$. (14)

From the $D$ elements of $b$, the $D$ vectors $\phi_d$, and (6), the SDF $h(x,y)$ is defined.
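
The chain from (9) to (14) can be checked with a short numerical sketch. It assumes NumPy, a random orthonormal basis and random training vectors as stand-ins, and small hypothetical sizes ($N_T = 20$, $D = 8$); it is an illustration, not the procedure used for the reported experiments.

```python
import numpy as np

rng = np.random.default_rng(9)
n_pix, D, NT = 100, 8, 20
Phi, _ = np.linalg.qr(rng.standard_normal((n_pix, D)))   # stand-in orthonormal basis (columns)
Z = rng.random((NT, n_pix))                               # stand-in training images z_n (rows)
u1 = np.zeros(NT)
u1[:4] = 1.0                      # 1 for the unshifted true-class images, 0 otherwise

C = Z @ Phi                       # row n holds the D projections z_nd of image z_n
R = C.T @ C                       # D x D matrix, the outer-product sum in (11)
rhs = C.T @ u1                    # sum_n u_n z_nd, as in (12)
b = np.linalg.solve(R, rhs)       # b = R^{-1} sum_n u_n z_nd, as in (14)
h = Phi @ b                       # filter as a combination of the basis functions, as in (6)

# The same coefficients from a generic least-squares solver applied to C b = u_1
b_lstsq = np.linalg.lstsq(C, u1, rcond=None)[0]
```

Only a $D \times D$ system is solved, which is the computational saving the text attributes to (14).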

Solving (14) involves the solution of $D$ rather than $d$ simultaneous LAEs. Thus, the
solution (14) is computationally simpler and faster, although the final result is more
approximate. The use of (14) lies primarily in its computational ease for cases when $N_T$
is large. It is also useful in cases when $h$ must be updated in real time by an on-line
processor. This corresponds to a Kalman filter update of the SDF function. This situation
arises when the projection value for an input object is near threshold. In this case, we
can update the filter with subsequent views of the input object and thus improve its original
projection value. Such cases occur when the scale of the target or its depression angle, etc.
differ from that of the training set images used. The solution in (14) minimizes the mean
square error


$$ J = \sum_{n=1}^{N_T} \left( u_n - \mathbf{h}^T \mathbf{z}_n \right)^2, \tag{15} $$


where $u_n = 1$ for $\mathbf{z}_n \in \{\mathbf{f}_i\}$ and is zero otherwise. Setting $\partial J/\partial \mathbf{h} = 0$ in (15) and solving for
h, we obtain the coefficient solution in (14). Since R in (14) is of reduced dimensionality
D = 30 (rather than d = 60), the solution in (14) is far simpler and more accurately computed.

The least-square correlation SDF filter solution h in (14) is an approximate solution to
the exact filter function problem in Section 2.1. Hence, the associated name for this filter
function noted earlier is employed. The accuracy of this solution depends upon how well
the several dominant KL eigenvectors per correlation matrix represent the data in the full
correlation matrix. The sum of the eigenvalues associated with these eigenvectors quantifies
this accuracy. This SDF-2 filter function thus attempts to select h such that the desired
peak values are as close to unity as possible and that the false target peak values at all
shifted image correlation values are as close to zero as possible (in a least-square sense).
A correlation peak threshold T = 0.5 can be used as before, or one can simply calculate C
for all correlation plane regions with large peak values. Experimental data on these methods
using such an SDF are presented in Section 3.

2.3  SDF-3  (Generalized  Correlation  SDF) 


SDF-2 is somewhat statistical since the h choice minimizes J in (15). The final type of
correlation SDF is also statistical. Rather than selecting h to cause the desired correlation
plane values to be as close to 1.0 and 0.0 as possible (as in SDF-1 and SDF-2), SDF-3
is chosen to maximize the peak-to-mean ratio C. In SDF-1 and SDF-2, the C test will provide
target discrimination and recognition plus invariance to object modulation (C is invariant
to object modulation or contrast). However, if the peak value for the target is not above
the threshold T, the C test will never be applied to the proper correlation plane region.
Assuming that maximizing C (or the correlation plane SNR) maximizes the peak
value $I_p$ of the correlation, then SDF-3 will produce both large peak values and
large C values. Specifically, the correlation plane regions with the largest peak values
are expected to include the correct target objects and the correlation plane regions
corresponding to the correct targets will also have large C values 14].

The central correlation plane value for an input image vector is $\mathbf{h}^T \mathbf{f}_i$. The mean-square
value of this correlation point value for all $\{\mathbf{f}_i\}$ is $E[(\mathbf{h}^T \mathbf{f}_i)^2] = \mathbf{h}^T \mathbf{R}_f \mathbf{h}$, where $\mathbf{R}_f$ is
the correlation matrix of $\{\mathbf{f}_i\}$. Maximizing the correlation plane SNR for correct targets
thus requires maximization of

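$$ J = \frac{\text{mean-square value of central correlation point for } \{\mathbf{f}_i\}}{\text{mean-square value of central point for } \{\mathbf{f}_i'\},\ \{\mathbf{g}_i\} \text{ and } \{\mathbf{g}_i'\}} = \frac{\mathbf{h}^T \mathbf{R}_f \mathbf{h}}{\mathbf{h}^T [\mathbf{R}_{fs} + \mathbf{R}_g + \mathbf{R}_{gs}]\, \mathbf{h}}, \tag{16} $$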

where $\mathbf{R}_{fs}$ is the sum of the four correlation matrices for the objects $\mathbf{f}_i$
shifted in ±x and ±y ($\mathbf{R}_{gs}$ is similarly defined). The solution that maximizes J in (16) is
the solution of


$$ \mathbf{R}_f \mathbf{h} = \lambda\, [\mathbf{R}_{fs} + \mathbf{R}_g + \mathbf{R}_{gs}]\, \mathbf{h}, \tag{17} $$

where $\lambda$ is the generalized eigenvalue of the matrices. The problem defined by (17) is the
well-known generalized eigenvalue problem and thus we refer to the SDF h that maximizes the
SNR defined by J as a generalized correlation SDF (SDF-3). The same $\{\phi_d\}$ orthonormal
basis function set used in the least-square SDF is again employed here with each matrix in
(17) being D x D = 30 x 30 (for our cases).
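A generalized eigenvalue problem of this kind can be solved with standard routines. The sketch below is illustrative only: the function name `generalized_sdf` is hypothetical, and it assumes the denominator matrix (here the sum of the shifted-true, false and shifted-false correlation matrices) is symmetric positive definite so that a Cholesky factor exists.

```python
import numpy as np

def generalized_sdf(R_f, R_den):
    """Return (lam, h) maximizing (h^T R_f h) / (h^T R_den h).

    R_f   : (D, D) symmetric correlation matrix of the true-class data
    R_den : (D, D) symmetric positive-definite denominator matrix
            (assumed here; e.g. R_fs + R_g + R_gs)
    """
    R_f = np.asarray(R_f, dtype=float)
    R_den = np.asarray(R_den, dtype=float)
    L = np.linalg.cholesky(R_den)          # R_den = L L^T
    L_inv = np.linalg.inv(L)
    M = L_inv @ R_f @ L_inv.T              # equivalent standard symmetric problem
    M = 0.5 * (M + M.T)                    # remove round-off asymmetry
    eigvals, eigvecs = np.linalg.eigh(M)   # eigenvalues in ascending order
    h = L_inv.T @ eigvecs[:, -1]           # map back: h = L^-T y
    return eigvals[-1], h
```

The largest generalized eigenvalue equals the maximum attainable ratio, and the returned vector satisfies `R_f @ h = lam * R_den @ h` up to round-off.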
For SDF-1 and SDF-2, regions of the correlation plane above T = 0.5 as well as the largest
peaks anywhere in the correlation plane are classified as regions of potential interest.
For each of these regions, we calculate the peak-to-mean ratio

$$ C = \frac{\text{central peak intensity}}{\text{mean in } 11 \times 11 \text{ window}}, \tag{18} $$

where an 11 x 11 = $(2d_s+1) \times (2d_s+1)$ window was chosen to agree with the $d_s = 5$ pixel image
shifts used in our data and where the correlation peak value is not included in the mean
calculation. For each potential region of interest, C is compared to a threshold $C_T$
determined from the C values calculated for the $N_1$ and $N_2$ centered training set images. Since
SDF-3 does not fix a correlation plane peak intensity, the largest correlation plane peaks
are selected, and C is calculated for these points and compared to $C_T$.
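The peak-to-mean ratio just described can be sketched as below. The function name `peak_to_mean_ratio` and the choice to reject windows that run off the edge of the correlation plane are assumptions for this example; the text does not say how border regions were handled.

```python
import numpy as np

def peak_to_mean_ratio(plane, row, col, ds=5):
    """Peak-to-mean ratio C at (row, col) of a correlation-plane intensity array.

    The mean is taken over a (2*ds+1) x (2*ds+1) window centered on the peak,
    with the peak value itself excluded from the mean.
    """
    plane = np.asarray(plane, dtype=float)
    r0, r1 = row - ds, row + ds + 1
    c0, c1 = col - ds, col + ds + 1
    if r0 < 0 or c0 < 0 or r1 > plane.shape[0] or c1 > plane.shape[1]:
        # Border handling is not specified in the text; this sketch refuses.
        raise ValueError("window extends beyond the correlation plane")
    window = plane[r0:r1, c0:c1]
    peak = plane[row, col]
    mean_without_peak = (window.sum() - peak) / (window.size - 1)
    return peak / mean_without_peak
```

A region would then be accepted as a target when the returned value exceeds the threshold chosen from the training set C values.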

3.  INITIAL  EXPERIMENTAL  RESULTS 


3.1  Database 


The ATR database used in our initial experimental results reported herein consisted of
three different objects (two tanks denoted as tank 1 and tank 2 and an APC). High resolution
images of these objects were obtained and decimated to produce 56 x 22 target pixel images
typical of data from a FLIR at the typical ATR acquisition range of interest. For each
object, 36 images from a 20° depression angle were available at 10° aspect intervals. The
pixel values of the images varied from 0 to 255 with most target pixels having values
near 0 and 150. In Figure 2, we show two images of tank 1 (M60) and the APC at two different
aspect views. Denoting the front of the tank as image 1, tank 2 images 11-15 were much
dimmer and tank 1 images 30-34 were much brighter.

Six or twelve image aspects per class were selected for the training set (all approximately
evenly spaced in aspect angle). The centered and four shifted images of each object
were used for the training set. Each SDF was designed to recognize tank 1 and to reject
either the APC or tank 2. Intra-class recognition and inter-class discrimination were always
tested using all 72 images in the two classes. Each test image was centered but all
points in the correlation plane were tested for the threshold T = 0.5. The correct central
correlation peak value was measured and compared to T = 0.5. The largest peak value anywhere
in the correlation plane was also measured. C in (18) was calculated only for this
largest peak point, regardless of whether it exceeded T and regardless of whether the central
point was above threshold. Errors in the peak intensity are expected due to aspect views not
in the training set, due to different aspect views of different objects being similar and due
to variations in the modulation level of the training set and test set images. Our data
represent very worst-case results. For true targets, by evaluating C at only one point, we
often miss a target, since C at the wrong point never exceeds $C_T$ (for a true target). For
false targets, by calculating C at the wrong point even though the peak intensity there is
below T, we can often misclassify a target. Thus, the data presented are quite worst-case and
significant improvement is possible.

FIGURE 2

Representative images for 3-D ATR testing

In Table 3, we show the data for tank 1 versus the APC for six aspect views per class. From
row one, we see that all correlation plane values are correct (below T) for false targets and
only three or five images have central peak values below T (for true targets). This was
found to be due to low modulation of the imagery and to aspect views quite different
from those in the reference data set. With T = 0.4 (0.35) for SDF-1 (SDF-2), all correct
peaks exceeded T. No peak threshold was used with SDF-3; rather, the largest several
correlation plane peaks would be investigated. Similar remarks apply to the practical realization
of SDF-1 and SDF-2. In row two, we see that the largest correlation peak always occurs in
the wrong place for a false target (as expected), but from row three, at most six of
these peaks have C > $C_T$. In row two, the largest peak is always in the correct location
for true targets (for SDF-1 and SDF-2). The number of C errors (C < $C_T$ for a tank input and
C > $C_T$ for an APC input) are listed in the table. Errors in the first case are missed
targets. Errors in the second case are misclassified objects. Most errors occurred for the
same images (four of which were very bright and three of which had aspect views significantly
different from those in the training set). In general, least-squares SDF-2 performs comparably
to SDF-1. The projection values were in general lower for SDF-2 (especially for the central
(correct) peak value). This is expected, since this is an approximate image solution and since
only three eigenvectors are used to represent each set of training set images for each shift.
SDF-3 performed worst. We might expect it to perform better, since it maximizes C. Modulation
variations in the training set appear to be the cause of its poorer performance. The
percent of all 72 images correctly classified and the percent of the objects with
T > 0.5 misclassified are noted in the tables.


TABLE 3

Worst-case performance of the three SDFs for Tank 1/APC data (72 images)
(6 training set images per class, approximately every 60°, 5 shifted versions of each)

SDF                                       SDF-1            SDF-2            SDF-3
INPUT                                     TANK 1   APC     TANK 1   APC     TANK 1   APC
No. of Central Peak Errors (T = 0.5)      3        0       5        0       -
No. of Largest Peaks in Wrong Location    0        All     0        All     7        All
No. of C Errors                           3        1       2        1       7        6
Percent Correct                           94.4%            95.8%            81.9%
Percent Wrong                             0%               0%               8.3%
C_T Threshold                             5.0              4.3              3.5

Table 4 shows similar data for tank 1 versus tank 2. The trends are quite similar.
Table 5 shows data for the case of 12 training set aspect images per class. The significant
reduction in the number of errors observed is due to the fact that the largest correlation
plane value is now in the correct location (for true targets). As in Tables 3
and 4, since C is calculated only at the largest correlation plane point, if this point is
wrong (for a correct target), then C never exceeds $C_T$ and a target is missed. In Tables 3
and 4, $C_T$ was set at 1.5 below the average C value for the training set images in both
classes. In Table 5, $C_T$ was set higher at 0.5 below the average (since with more training
set images, our confidence is higher).


TABLE  5 

Worst-case  performance  of  three  SDFs  for  Tank  1/APC  data  (72  images) 
(12  training  set  images,  every  30°,  5  shifted  versions  of  each) 


[Table 5 body not recovered; surviving row labels: Percent Wrong, C_T Threshold]

4. SUMMARY AND CONCLUSION

The three new SDFs described represent 3-D object information and discrimination information
between 3-D objects quite well. Initial tests give about 94-96% correct classification
for SDF-1 and SDF-2 (Table 3). As noted, the test performed is quite worst-case, since only
the largest correlation plane point was used and because of fluctuations in the modulation
of the training set imagery.

Correlation Filters for Distortion-Invariance and Discrimination

David Casasent and Abhijit Mahalanobis
Carnegie-Mellon  University 
Department  of  Electrical  and  Computer  Engineering 
Pittsburgh,  Pennsylvania  15213 

1.  INTRODUCTION 


Correlators are powerful shift-invariant object recognition systems that perform
well in noise. However, they are quite sensitive to distortions between the input
and reference object. Synthetic discriminant functions (SDFs) [1] accommodate
intra-class distortions and provide inter-class discrimination. In Section 2, we
review these projection SDFs and note that they restrict only the peak point in
the correlation plane. In Section 3, new correlation SDFs [2] are described.
They control both the peak and sidelobe response and thus exhibit superior
performance. Initial test data on these SDFs are presented in Section 4.

2.  PROJECTION  SDFs 

In the synthesis of projection SDFs, the SDF h is a linear combination of the
training set images {f} in classes 1, 2, etc., i.e.

$$ h(x,y) = \sum_n a_n f_n(x,y). \tag{1} $$

The coefficient vector a that defines h is given by

$$ \mathbf{a} = \mathbf{R}^{-1}\mathbf{u}, \tag{2} $$

where R is the vector inner product matrix of all {f} with class one data $\{f_1\}$
being the first $N_1$ images and class 2 data the next $N_2$ images, etc. The
elements of the deterministic vector u define the filter's desired response for the
{f} data. With the first $N_1$ elements of u unity and the next $N_2$ elements zero,
the SDF provides a "1" output for all $\{f_1\}$ and a "0" for all $\{f_2\}$. Many other
choices for u exist and correspond to the various types of SDFs [1].

However, this filter synthesis only restricts the central peak or correct
projection value in the correlation output. There is no guarantee that the value at
other locations in the correlation plane will not exceed the value at the point of
registration (we refer to this as the central value, with no loss of generality, and
denote this value by $I_p$). This problem is particularly severe when the input is a
false target (one in class 2) for which a "0" output is desired. Another
shortcoming of projection SDFs is that only a simple correlation plane threshold
(T = 0.5 or other levels) is used to achieve object detection.

3.  CORRELATION  SDFs 

Projection SDFs adequately control $I_p$. To control the sidelobes, we increase
the training set size to include $N_s$ shifted versions of each training set image: the
centered image and $(N_s - 1)$ shifted versions. We select $N_s = 5$ and the shifted
images symmetrically to be $d_s$ pixels in both ±x and ±y. Correlation SDF
synthesis still uses (1) and (2) with {f} and R being larger, i.e. with $N_s(N_1 + N_2) = N_T$
training images (for a two-class problem). The control vector u has zero
values for elements corresponding to all shifted versions of all images. Denoting
class 1 (true) objects by f, class 2 (false) objects by g and shifted versions of
each by primes, the filter requirements are

$$ \mathbf{h} \cdot \mathbf{f}_i = 1, \quad \mathbf{h} \cdot \mathbf{g}_i = 0, \quad \mathbf{h} \cdot \mathbf{f}_i' = 0, \quad \mathbf{h} \cdot \mathbf{g}_i' = 0. \tag{3} $$

The linear combination correlation SDF h is

$$ h(x,y) = \sum_{i=1}^{N_1} a_{i1} f_i(x,y) + \sum_{i=1}^{N_2} a_{i2} g_i(x,y) + \sum_j a_{j3} f_j'(x,y) + \sum_j a_{j4} g_j'(x,y), \tag{4} $$

where the last two summations are over $(N_s - 1)N_1$ and $(N_s - 1)N_2$ terms. The vector inner
product matrix R is $N_T \times N_T$ with the first $N_1$ images being $\{\mathbf{f}_i\}$. The SDF h is
now defined by the solution a to

$$ \mathbf{R}\,\mathbf{a} = \mathbf{u}_1 = [\,\underbrace{1 \cdots 1}_{N_1}\ \underbrace{0 \cdots 0}_{N_T - N_1}\,]^T. \tag{5} $$

This correlation SDF thus forces the true-class peak to 1, the false-class
peak to 0 and the sidelobes ($d_s$ pixels from the peak) to 0 for both $\mathbf{f}_i$ and $\mathbf{g}_i$.
Thus, the true correlation peak will have a well-defined shape. False targets will
have low response over most of the central correlation region. Use of more
training set images with shifts $2d_s$, etc. can control the full correlation plane
response. This correlation SDF synthesis concept, first introduced in [2], is a
refinement of the decorrelation SDF in [3]. Other variations follow directly [2],
such as a least square solution (to reduce the dimensionality of the data), an
SDF that maximizes the peak to sidelobe ratio (PSR) (rather than forcing the peak
and sidelobes to specific values), etc. In Section 4, we present new test data
on the performance of these correlation SDFs.
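The correlation SDF synthesis described in this section can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: the helper names `shifted_versions` and `correlation_sdf` are hypothetical, shifts are implemented as circular shifts with `np.roll` (the text does not say how image borders were treated), and the training images are assumed to be linearly independent so that R is invertible.

```python
import numpy as np

def shifted_versions(img, ds):
    """Centered image followed by its four shifts of ds pixels in +-x and +-y.

    Circular shifting is an assumption made for this sketch.
    """
    shifts = [(0, 0), (0, ds), (0, -ds), (ds, 0), (-ds, 0)]
    return [np.roll(img, s, axis=(0, 1)) for s in shifts]

def correlation_sdf(true_imgs, false_imgs, ds=5):
    """Solve R a = u for a two-class correlation SDF and return (h, a)."""
    f = [np.asarray(x, dtype=float) for x in true_imgs]
    g = [np.asarray(x, dtype=float) for x in false_imgs]
    f_shift = [s for x in f for s in shifted_versions(x, ds)[1:]]
    g_shift = [s for x in g for s in shifted_versions(x, ds)[1:]]

    # Ordering: centered true, centered false, shifted true, shifted false.
    train = f + g + f_shift + g_shift
    X = np.stack([t.ravel() for t in train])   # (N_T, pixels)

    R = X @ X.T                                # vector inner product matrix
    u = np.zeros(len(train))
    u[:len(f)] = 1.0                           # 1 for centered true images only
    a = np.linalg.solve(R, u)

    h = (a @ X).reshape(f[0].shape)            # h = sum_n a_n * (training image n)
    return h, a
```

By construction, the inner product of `h` with each centered true-class image is 1, and with every other training image (centered false-class images and all shifted versions) it is 0.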

4.  TEST  RESULTS 


To test the performance of these correlation SDFs, available software that
produced images of different aircraft at different in-plane rotations θ and scales and
from different viewing angles φ (out-of-plane rotations) was used. We selected
two aircraft per set (Set A: Class 1 = Mig, Class 2 = DC10; and Set B: Class 1 =
Mig, Class 2 = F105), $d_s$ = 5 and 8 pixels, $N_s$ = 5 (and thus $N_T = 10N$, where
N is the number of training set images per class). We generated 36 images (in-plane
rotation increments Δθ = 10°) per class and thus desire N < 36. The
image resolution used was 128 x 128.

With φ = 0 and $d_s$ = 5, N = 6 (Δθ = 60°) was used for Set A. The 6 training
set images per class are shown in Figure 1. Three intermediate images per class
(θ = 15°, 30°, 45°) are shown in Figure 2. The correlation SDF was formed
(Section 3). Tests of the full correlation plane data for the training set data are
shown in Table 1. The value of the correct peak $I_p$, the largest peak $I_p'$ and the
PSR at both peaks are listed. For true targets (class 1), the correct and largest
peak coincide. All $I_p$ are 1.0 as expected and PSR is large (≈3.88) and rather
constant. Data for the false targets (right side of Table 1) show the expected
values (0.0) at the central peak and large (≈0.52) but less than 1.0 peak values
at other locations. As expected, PSR at these points is less (2.00 max) than for
true targets. Data for test set imagery (Table 2) show larger peaks ($I_p$ > 0.53)
for true targets than for false targets ($I_p$ < 0.50) and larger PSR for true targets
(> 2.21) than for most false targets (< 2.1). The PSR = 2.4 value for one false
target corresponds to an $I_p$ = 0.35 and is thus easily distinguished.

From these data, we see that an $I_{pT}$ = 0.5 threshold alone provides 100%
correct recognition. The combination of $I_{pT}$ = 0.5 and $PSR_T$ = 2.3 ensures even
more reliable performance. Because of symmetry, the three test data in Table 2
typify all results. Tests of φ rotation effects were conducted. They are more
severe conditions and require more training set images, different $d_s$ and tighter $I_{pT}$
and $PSR_T$ thresholds.


(a) Class 1 Mig Training Set Images, φ = 0, Δθ = 60°


(b) Class 2 DC10 Training Set Images, φ = 0, Δθ = 60°


Figure  1:  Training  Set  Images  Used 


(a)  Class  1  (b)  Class  2 

Figure  2:  Three  Typical  Test  Images  Per  Class 


CLASS 1 (MIG) TRUE CLASS        CLASS 2 (DC10) FALSE CLASS
I_p = I_p'    PSR               I_p     I_p'
1.00          3.90              0.0     [illegible]
1.00          3.86              0.0     [illegible]
1.00          3.94              0.0     0.61
1.00          3.88              0.0     0.54
1.00          3.81              0.0     0.43
1.00          3.84              0.0     0.51
[LOCATION and PSR columns for Class 2 not recovered]

Table 1: Peak Intensity I_p, Largest Peak I_p' and PSR (Training Set Data)
[Table 2 layout not recovered. Column headings: I_p, PSR and LOCATION for Class 1 (true class)
and Class 2 (false class) inputs. Surviving entries, in order: 2.4, 3.3, 2.4, 3.1, 2.2, 3.0,
2.1, 1.6, 2.4, None, None, None.]

Table 2: Typical Test Image Data
The modeling of system and component noise and error sources in optical linear algebra processors (OLAP’s) is
considered, with attention to the frequency-multiplexed OLAP. General expressions are obtained for the output
produced as a function of various component errors and noise. A digital simulator for this model is discussed.


Optical linear algebra processors (OLAP’s) represent a most attractive class of
general-purpose optical processors with parallel and real-time features.1 The
frequency-multiplexed OLAP2 is easily fabricated, permits a competitive high computation
rate, and with different data-encoding schemes allows all the basic operations
of linear algebra functions to be performed with excellent pipelining and flow of
data.3 We thus emphasize this architecture in our present study. Many OLAP’s
that operate on digital data have also been suggested.1 These systems achieve the
accuracy of a digital processor together with the speed and parallel-processing
advantages of optical systems. Despite this widespread interest, little attention4
has been given to an analysis and modeling of the various noise and error sources in
such optical architectures. We briefly review the frequency-multiplexed OLAP and
the basic linear algebra operations required. Then we detail the types of errors
possible in such a processor and derive our model for noise- and error-source effects
in OLAP’s and the expression for the output obtained as a function of the
various system-component noise and errors. We discuss digital simulation of this
model and its use. The modeling, simulation procedure, and general approach
that we use are valid for most OLAP’s, including digital-optical linear algebra
processors.

A simplified diagram of the frequency-multiplexed OLAP is shown in Fig. 1. This
architecture consists of N input point modulators imaged through N separate
regions of an acousto-optic (AO) cell (with each region separated by a bit time $T_B$).
The AO cell is fed with N 1-D input signals, each on a different temporal-frequency
carrier. We view these signals as N vectors, each on a spatial carrier. The light
intensity distribution leaving the cell is then the products of the input
vector (from the point modulators) and the N vectors in the cell, with each such
product leaving the cell at an angle proportional to the input frequency to the cell.
The Fourier-transform (FT) lens sums the elements of each vector product (by space
integration) and forms each of the N vector inner products on a separate output
detector. The detector output voltages (or currents) are thus proportional to the
(N x N) matrix-vector product, with one matrix-vector multiplication
performed each $T_B$.

If intensity-mode operation is used, the signals to be processed are present on a
bias. The effects of these bias terms in the output data must be removed and
corrected for. The necessary correction signals can be easily obtained with a
separate adjunct processor channel similar to the way in which bias was corrected
in the initial optical matrix-vector processors using two-dimensional masks.
Amplitude-mode operation of the AO cells and the system is also possible and in
some cases preferable. In the conventional system, the detected output intensity
will be the square of an amplitude product, and thus the square root of the input
(or output) data must be produced. Methods to achieve this exist, but coherent
detection at the output is preferable. In this case, the detector output voltage is
proportional to the desired amplitude product. Either mode of operation requires
attention to the choice of frequencies and their separation to ensure linearity and
suppression of cross talk. The effects of intermodulation-induced cross talk
require further examination.

No delays exist in this processor since data flow continuously, as detailed
elsewhere,3 even though the same matrix remains in the AO cell for $N T_B$. With
different space (x), time (t), and frequency (f) encoding, matrix data can be
processed by the system, and various matrix-vector, matrix-matrix, and
matrix-matrix-matrix multiplications and iterative and direct solutions
of systems of linear algebra equations can be realized.3
The basic operation performed by the system is thus a matrix-vector product each
$T_B$. This is the basic building block of all other matrix operations and direct
and indirect solutions of linear and nonlinear algebraic equations.3 In this
Letter, we describe our noise- and error-source modeling of the
frequency-multiplexed OLAP in terms of this basic Ab = c system operation.


Fig.  1.  Simplified  schematic  of  a  frequency-multiplexed 
optical  linear  algebra  processor.  (After  Ref.  2.) 
The basic architecture of most OLAP’s consists of a linear array of input point
modulators, an AO cell, and a detector array. Thus limiting our modeling to the
system of Fig. 1 is not overly restrictive. In the initial modeling, we assume
ideal lenses, no dispersion, and no cross talk. This yields a closed-form expression
for the effect of errors, which provides useful insight. Accommodating other
effects and more advanced component errors in the simulator is discussed below.
In the system of Fig. 1, various input-plane (point-modulator) errors are possible.
These include variations in the bias level or level of lasing for the input
modulators and variations in the response of each input point modulator. Acoustic
attenuation of the signal in the cell produces a deterministic taper exp(-αx) across
the length x of the AO cell, where α is the attenuation constant of the AO cell
material used. For now, we assume that α is small and nondispersive. These spatial
errors plus variations in the spatial response of the AO cell owing to imperfections
in the AO material or the transducer used can also be modeled as input-plane
errors. These spatial errors are correctable and can be reduced to low residual
levels by adjusting the input signals to the point modulators and the AO cell or by
use of a correction mask in front of the AO cell (the α error


6,  -  mi  +  «n» + 6# + «a»i.  (i) 

Similarly, the actual and ideal transmittance of the matrix data in the AO cell for
element j, i (i denotes space and j denotes frequency) are related by

$$ \hat{a}_{ji} = a_{ji}\,[1 + \delta_i^{(2)}]\,H(f_j)\exp(-\alpha x_i), \tag{2} $$

where $x_i$ denotes the distance of the ith data block from the AO cell’s transducer.
Likewise, the elements of the observed and ideal detector plane outputs $\hat{\mathbf{c}}$ and $\mathbf{c}$ are
related by

$$ \hat{c}_j = c_j\,[1 + \delta_j^{(3)}] + d_j + n_j(t). \tag{3} $$

We  combine  all  spatial  errors  (subscript  i)  into  the 
single  variable 

ii  =  +  *51*  +  ill*  +  6<2>  +  5[2).  (4) 

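Combining Eqs. (1)-(4) and assuming all error sources to be small, the elements
$\hat{c}_j$ of $\hat{\mathbf{c}}$ are

$$ \hat{c}_j = \sum_i a_{ji}\, b_i\, (1 + \psi_i)(1 + \delta_j)\, H(f_j) \exp(-\alpha x_i) + d_j + n_j(t). \tag{5} $$

To provide a more vivid description relating $\hat{\mathbf{c}}$ to $\mathbf{c}$ and the various system and
component noise and errors, we detail Eq. (5) for a 2 x 2 matrix as

$$ \begin{bmatrix}\hat{c}_1\\ \hat{c}_2\end{bmatrix} = \begin{bmatrix}1+\delta_1 & 0\\ 0 & 1+\delta_2\end{bmatrix} \begin{bmatrix}H(f_1) & 0\\ 0 & H(f_2)\end{bmatrix} \begin{bmatrix}a_{11} & a_{12}\\ a_{21} & a_{22}\end{bmatrix} \begin{bmatrix}1+\psi_1 & 0\\ 0 & 1+\psi_2\end{bmatrix} \begin{bmatrix}\exp(-\alpha x_1) & 0\\ 0 & \exp(-\alpha x_2)\end{bmatrix} \begin{bmatrix}b_1\\ b_2\end{bmatrix} + \begin{bmatrix}d_1\\ d_2\end{bmatrix} + \begin{bmatrix}n_1(t)\\ n_2(t)\end{bmatrix}. \tag{6} $$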
To  provide  further  insight,  we  explicitly  describe  each 
error-matrix  term  in  Eq.  (6)  by  its  associated  origin, 
effect can be corrected only at one frequency, however).
As noted above, all spatial errors in the AO cell can be mapped into spatial
input-plane errors. Similarly, any frequency-dependent AO cell errors can be mapped
to the output plane (since the FT lens converts frequency in the AO plane into
position in the detector plane). The output detector plane errors thus include
variations in the frequency response H(f) of the AO cell, variations in the spatial
response of the individual output detectors, variations in the dark current of the
individual detectors, and time-varying detector noise. The effect of these last two
detector plane errors on the system output is additive rather than multiplicative,
as we will shortly demonstrate.

In Table I, we summarize the notation used and the various input, AO, and detector
plane errors. We also include errors that describe spatial variations in the
coupling between the inputs and the AO cell. With this formulation and notation,
the elements $\hat{b}_i$ of the actual input vector are related to the elements $b_i$ of the
ideal input vector by


When  acoustic  attenuation  is  small,  Eq.  (6)  is 


where the spatial and temporal errors are now additive. From Eqs. (7) and (8), and
the fact that all OLAP spatial errors can be reduced to the desired residual levels
by correction, detector noise and the dispersive nature of α are potentially the
most dominant error sources. If α effects are not small, then the decoupling in
Eq. (8) does not occur; the various spatial and detector plane errors can still be
grouped and combined as in Eq. (7), but the simplified form in Eq. (8) does not result.

To  quantify  system  performance  and  the  effect  of 
each  noise-  and  error-source  component  in  the  OLAP 
for  a  given  operation,  computer  simulation  is  required. 
The error sources are quite different from those typically
treated in analysis of conventional linear algebra
processors.5 We now briefly discuss how we digitally
model  the  various  error  sources  in  Eqs.  (6)  and  (7). 
Table I. SAOP Error Source Model

Error Source                 Notation
Spatial errors               Subscript $i$
Frequency errors             Subscript $j$
Input plane errors           Superscript 1
AO cell errors               Superscript 2
Detector-plane errors        Superscript 3

Input Plane Errors           Notation
Point modulator
  Spatial gain
  Bias nonuniformity         $1 + \delta_i^{(1)}$
Coupling (spatial)           $1 + \epsilon_i^{(1)}$

AO Cell Plane Errors         Notation
Amplifier errors             $1 + \delta_j^{(2)}$
Spatial response             $1 + \delta_i^{(2)}$
AO transfer function         $H(f_j)$
Acoustic attenuation         $\exp(-\alpha x_i)$

Detector Plane Errors        Notation
Spatial response             $1 + \delta_j^{(3)}$
Dark current                 $d_j$
Time-varying noise           $n_j(t)$


From experiments on our laboratory OLAP systems,
we found that the residual spatial errors and the detector
noise can be modeled as zero-mean Gaussian
random numbers and that signal-dependent (quantum)
noise is not present. The frequency response $H(f)$ and
the acoustic attenuation can be modeled as deterministic
errors that are quantified by measurements on the
OLAP. This deterministic function multiplies the
matrix data in the cell as in Eq. (6). Since the spatial
errors are independent of time, the random numbers
representing each such error are generated once by
standard IMSL6 or other software and stored. The $3\sigma$
deviation of each random number is chosen
to equal the percentage error to be modeled. For input
and AO cell spatial errors, the random numbers are included
in each input vector datum $b$ each $T_B$, and for
detector spatial errors the associated random numbers
are added to the computed output vector each $T_B$ as in
Eq. (6). For fixed or spatial errors, the same set of
random numbers is used at each $T_B$. To simulate detector
noise, a new set of uncorrelated variables with
Gaussian probability distribution is generated each
$T_B$.
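
The distinction between fixed spatial errors (drawn once and stored) and time-varying detector noise (redrawn every cycle) can be sketched as follows. This is only an illustrative sketch and not the simulator described in the text: the function names, the 2 x 2 test data, the multiplicative form used for the detector response, and the absolute noise level are all assumptions made for the example.

```python
import random

def residual_errors(n, percent, rng):
    """Zero-mean Gaussian errors whose 3-sigma value equals `percent` (assumed scaling)."""
    sigma = (percent / 100.0) / 3.0
    return [rng.gauss(0.0, sigma) for _ in range(n)]

def matvec(A, b):
    """Ideal matrix-vector product."""
    return [sum(a * x for a, x in zip(row, b)) for row in A]

def olap_cycle(A, b, input_err, det_err, noise_sigma, rng):
    """One cycle: the stored spatial errors are reused, detector noise is redrawn."""
    b_actual = [x * (1.0 + e) for x, e in zip(b, input_err)]   # input-plane errors
    d = matvec(A, b_actual)
    d = [y * (1.0 + e) for y, e in zip(d, det_err)]            # detector response (assumed multiplicative)
    return [y + rng.gauss(0.0, noise_sigma) for y in d]        # new noise sample each cycle

rng = random.Random(0)
A = [[1.0, 0.5], [0.25, 1.0]]      # hypothetical test matrix
b = [1.0, 2.0]                     # hypothetical test vector
input_err = residual_errors(len(b), 1.0, rng)   # generated once and stored
det_err = residual_errors(len(A), 1.0, rng)     # generated once and stored
outputs = [olap_cycle(A, b, input_err, det_err, 0.001, rng) for _ in range(3)]
ideal = matvec(A, b)
```

With the noise level set to zero, repeated cycles return identical outputs, which is the behavior expected of fixed (correctable) errors; any cycle-to-cycle variation comes only from the detector noise term.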

The model above and the form of the result in Eqs.
(6)-(8) are useful for conveying error effects in closed
form, for showing how various error sources can be
grouped, and for noting which error sources are correctable,
multiplicative, and additive. Other error
sources and other models for the various components
can be included directly in the simulator [but do not
lend themselves to convenient diagonal matrices as in
Eq. (6) and to a closed-form expression for the system].
Variations in the bias level of the point modulators and
all errors are assumed to be small residual errors (after
correction). Thus bias-level variations are included in
$\delta_i^{(1)}$. If such individual errors are not small, performance
will be too poor to consider. A primary purpose of our
initial model and its simulator is to quantify the dominant
error sources and the magnitude allowed for each
(i.e., the level to which fixed spatial errors must be
corrected and the amount of noncorrectable errors
allowed).

For quantitative performance data, other advanced
models can be used. Exact transfer curves (after correction)
for each point modulator and detector can be
measured and used in the actual simulator. We have
done this and found the results (for the small residual
errors present in practice) to be the same as those obtained
using our random variable modeling. To include
the dispersive nature of $\alpha$, a different $\exp(-\alpha x)$ factor
is used for each signal in the AO cell. This is a fixed
factor (different for each frequency signal) that multiplies
the present spatial contents of the cell each $T_B$.
Our simulator includes this feature, but it is not conveniently
included in the equation formulations above.
Similar remarks apply to cross-talk effects in the AO cell
and to the electronic circuit models.

From detailed simulations and analyses with the
model in Eq. (6), we found that acoustic attenuation and
detector noise are the dominant error sources. In initial
simulations,4 we found that $\alpha$ effects are dominant in
iterative algorithms and detector noise is dominant in
direct algorithms. We also found that the effects of
small multiple-error sources are additive as in Eq.
(8).

The various error sources that arise in an OLAP have
been tabulated and grouped into two classes (correctable
or fixed and time-varying) and classified according
to the plane (input, AO cell, output detectors) in which
they originate. Combining these separate error sources,
we find that error matrices in systolic processors are
multiplicative and that acoustic attenuation is an important
error source in OLAP’s employing AO cells.
The model and simulation technique advanced can and
should be applied to other OLAP’s to quantify the
dominant error sources, the effect of multiple errors,
and the performance to be expected from each system
for each application and algorithm.
ABSTRACT

An iterative algorithm for the solution of a quadratic matrix equation (the algebraic
Riccati equation) is detailed. This algorithm is unique in that it allows the solution of
a nonlinear matrix equation in a finite number of iterations to a desired accuracy. Theoretical
rules for selection of the operation parameters and number of iterations required are
advanced and simulation verification and quantitative performance on an error-free system
are provided. An error source model for an optical linear algebra processor is then advanced,
analyzed and simulated to verify and quantify our performance guidelines. A comparison
of iterative and direct solutions of linear algebraic equations is then provided.
Experimental demonstrations on a laboratory optical linear algebra processor are included
for final confirmation. Our theoretical results, error source treatment and guidelines are
appropriate for digital systolic processor implementation and for digital-optical processor
analysis.


1. INTRODUCTION

Optical linear algebra processors (OLAPs) represent a most general and attractive use of
the parallelism and real-time processing features of optical systems [1]. The frequency-multiplexed
acousto-optic (AO) processor [2,3] of Figure 1 represents a most general-purpose
OLAP architecture with ease of fabrication [4] and competitive computational rates [2,4].

In this architecture (Figure 1), N point modulator inputs are imaged through N separate
regions of an AO cell. These individual regions are separated by $T_B$ of time (for propagation
of the acoustic wave) and by a physical distance $d_B$. In [2], the use of this processor in
iterative algorithms, direct LU and QR matrix decomposition algorithms, and triangular system
solutions was detailed.
FIGURE 1

Simplified  schematic  of  a  frequency-multiplexed  optical  linear 
algebra  processor  [3] 

In this paper, we consider the use of this processor for the solution of a nonlinear matrix
equation (Section 2). The specific application chosen is the solution of the algebraic
Riccati equation (ARE). This nonlinear equation is similar to the expressions to be solved
in Kalman filtering and other advanced modern signal processing algorithms. An iterative
solution is necessary for such problems and for eigensystem solutions. Our proposed nonlinear
ARE solution is unique in that it requires a finite number of steps to achieve
a specific accuracy and performance. In Section 3, we summarize selection of the operational
parameters for such an iterative algorithm and the theoretical basis for our choice of
the fixed number of iterations to be used. Section 4 presents initial error-free simulation
data. In Section 5, we advance our error source model. In Section 6, we review our iterative
and direct solutions to systems of linear algebraic equations (LAEs). This represents
the fundamental operation required in advanced linear algebra algorithms. Section 7 contains
simulation data to quantify the dominant error sources and the accuracy expected from
such algorithms. We conclude in Section 8 with the experimental verification and quantification
of our theoretical results. Our summary and conclusions are then advanced in Section 9.

2.  NONLINEAR  MATRIX  SOLUTION 

In reference [5], we detailed a solution to the linear quadratic regulator control problem to
minimize a quadratic performance index for a linear system. Computation of the regulator
feedback gain matrix K that defines the optimal controls u involves the solution of the ARE

$S F + F^{T} S - S L S + Q = 0$    (1)

for S. To achieve this, we used the Kleinman algorithm [5] and the solution of the vectorized
Lyapunov equation to format the solution of (1) as a solution of the set of LAEs

$H(k)\, s(k) = v(k)$,    (2)

where s and v are the vectorizations of S and $S L S - Q$ respectively and H is a Kronecker formatted
matrix. This system of LAEs must be solved successively with different matrices H
and vectors v, with the results of one cycle used to compute the matrix H and vector v for
the next cycle. To achieve this, we employ a two-loop iterative algorithm described by

$s(r+1,k) = [I - \omega(k) H(k)]\, s(r,k) + \omega(k)\, v(k)$.    (3)

In solving (2) using (3), we solve (2) for one outer loop iteration k, update H and v and
solve the next LAE. This procedure continues until s is of sufficient accuracy. The algorithm
in (3) implies an iterative solution for each LAE. Direct solutions are also possible
as we discuss in Sections 6 and 7. The indices r and k in (3) refer to Richardson (inner)
and Kleinman (outer) loop iterations respectively.
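
The two-loop structure of (3) can be illustrated on a scalar version of (1), where the vectorization is trivial. The sketch below is a hedged illustration rather than the authors' implementation: the scalar reduction, the sign convention chosen for the scalar H and v (picked so that H is positive and the Richardson iteration converges), the choice of omega, and the test values f = -1, l = 1, q = 3 are all assumptions.

```python
def solve_scalar_are(f, l, q, outer_loops=6, inner_loops=20):
    """Two-loop solution of the scalar ARE 2*f*s - l*s**2 + q = 0.

    Outer (Kleinman) loop: rebuild the scalar LAE h*s = v from the current s.
    Inner (Richardson) loop: s <- (1 - omega*h)*s + omega*v, a fixed number of times.
    """
    s = 0.0                                   # s(0,0) = 0; requires f < 0 (stable open loop)
    for _ in range(outer_loops):              # Kleinman (outer) iterations, index k
        h = -2.0 * (f - l * s)                # scalar H(k), positive when f - l*s < 0
        v = l * s * s + q                     # scalar v(k) under the assumed sign convention
        omega = 0.5 / h                       # hypothetical choice with 0 < omega*h < 2
        for _ in range(inner_loops):          # Richardson (inner) iterations, index r
            s = (1.0 - omega * h) * s + omega * v
    return s

s = solve_scalar_are(-1.0, 1.0, 3.0)          # exact positive root of s**2 + 2*s - 3 = 0 is 1
residual = 2.0 * (-1.0) * s - 1.0 * s * s + 3.0
```

Each inner loop starts from the latest estimate of s, and both loops run for a fixed count rather than testing convergence, which mirrors the fixed-iteration strategy described in the text.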

3.  OPERATIONAL  PARAMETER  SELECTION 

In an iterative algorithm such as (3), various operational parameters must be selected.
The initial selection s(0,0) for S and the choice s(0,k) for each LAE solution are required.
For s(0,0), we use 0 to insure outer loop convergence (a stability matrix). For s(0,k), we
use the obvious choice of the prior s(0,k-1) estimate. The acceleration parameter $\omega$ in (3)
is chosen to be $\omega = 1/\lambda_{max} \approx 3/\|H(k)\|$. This insures inner loop convergence [2,5]. Choosing
when to stop the inner loop iterations (index r) for each LAE solution and the
outer loop iterations (index k) is a major decision.

In reference [5], we derived bounds for the inner loop error, the outer loop error and
their coupling. From this analysis, we derived the selection of a fixed number of inner
loop iterations R to solve each LAE given by

$R = nC = C \log \mu = 1.5C$ to $3.0C$,    (4)

where $\|x^*(0) - x^*(1)\| < \mu$ and $[1 - 1/C]^R \approx \exp(-n) < 1/\mu$ is chosen. This follows from our
analysis of the error in an iterative solution (due to a fixed number of iterations R),
which showed that the norm of such an error is

$\|s(r,k) - s^*\| = \|I - \omega H(k)\|^r = (1 - 1/C(k))^r$,    (5)

where C is the condition number of H. Since r is expected to increase with C, we set r = nC
and thus select n such that the error between the computed solution s and the exact solution
s* in (5) is as small as is required. For the fixed number of outer-loop iterations K, we
use K = 5 or 6, which can be theoretically derived (and appropriately modified) for other
applications with matrices with specific features. These iterative operational parameter
selections are summarized in Table 1.
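
As a rough numerical check of the rule R = nC, the sketch below computes the iteration count for a given condition number and target relative error, assuming the natural logarithm is intended (so that n = ln(1/error)). The function names are hypothetical. The example values C = 120 and 0.6% are the laboratory case quoted later in this paper, for which 613 iterations are reported.

```python
import math

def inner_iterations(cond, rel_error):
    """R = n*C with n = ln(1/rel_error), so that (1 - 1/C)**R falls below rel_error."""
    n = math.log(1.0 / rel_error)
    return n * cond

def residual_factor(cond, iterations):
    """Error reduction (1 - 1/C)**r predicted by Eq. (5)."""
    return (1.0 - 1.0 / cond) ** iterations

R = inner_iterations(120.0, 0.006)                  # about 613.9
achieved = residual_factor(120.0, math.ceil(R))     # slightly below 0.006
```

Because (1 - 1/C)^C is always smaller than exp(-1), choosing R = nC guarantees a reduction of at least exp(-n), which is why the iteration count scales linearly with the condition number.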

4. ERROR-FREE SIMULATION RESULTS

The performance measures we adopted to assess performance of the algorithm in Section 2
implemented using the operational parameters in Table 1 are the maximum percent error in any
element of the matrix K (i.e. $\Delta K_{max}\%$) and the maximum error in the location of the closed-loop
poles of the system ($\Delta\lambda_{max}\%$). We expect $\Delta K \gg \Delta\lambda$ and note that $\Delta\lambda$ is the more
appropriate error measure for this specific application and that similar error measures
should be used to evaluate the performance of other specific case studies. In
Figures 2 and 3, we show the variation of these two error measures with the number of outer
loop iterations k for a fixed number of inner loop iterations for two case studies. These
case studies are the fifth (Figure 2) and third (Figure 3) order models of an F100 engine.
As seen from the data for these two case studies, the use of a fixed number of iterations
results in a monotonic decrease in the solution error with the $\Delta K$ error being approximately
ten times that of the $\Delta\lambda$ error. From these results, we conclude that the use of a fixed
number of iterations can yield adequate results when the number of iterations is properly
chosen. Our parameter selection guidelines in Table 1 have thus all been verified and
discussed.

TABLE 1

Operational Parameter Selection Guidelines [5]

SYMBOL        PARAMETER                            PREFERRED CHOICE
s(0,0)        Initial Initialization               s(0,0) = 0
s(0,k)        k-th Kleinman Loop Initialization    s(0,k) = s(0,k-1)
R             Number of Inner Loop Iterations      R = 1.5C to 3.0C
K             Number of Outer Loop Iterations      K = 5 - 6
$\omega(k)$   Acceleration Parameter               $\omega(k) = 3/\|H(k)\|$


FIGURE 2

Variation of the error measures $\Delta K_{max}(\%)$ and $\Delta\lambda_{max}(\%)$ with the number of
outer-loop iterations K for different inner-loop iteration stopping criteria for the
fifth-order HPG3 F100 model. (Horizontal axis: number of outer loops, 1 to 7.)

FIGURE 3

Variation of the error measures $\Delta K_{max}(\%)$ and $\Delta\lambda_{max}(\%)$ with the number of
outer-loop iterations K for different inner-loop iteration stopping criteria for the
third-order HPG3 F100 model.


5. ERROR SOURCE MODEL

In earlier publications [7,8] we detailed the first system and component error source
model for an OLAP and the general issue of errors in such an architecture. In this section,
we review this OLAP error source model. In this model, we distinguish input, AO cell and
detector plane errors separately. Spatial errors include: input and detector response
variations and errors in the interconnections between the input modulators and the AO cell,
and detector dark current. The spatial variations are fixed (time-independent) and are
correctable to small residual levels as required (by adjusting the gain of the input point
modulators, the detector amplifiers, and the input matrix and vector data). Detector noise is
the only time-varying error source considered. Acoustic attenuation produces a deterministic
exponential variation of the data in the AO cell. This effect is dispersive, but its frequency
dependence is not included in our present model. Acoustic attenuation can be corrected
at one frequency and is thus an input spatial error. The product of an input matrix A and vector
b thus yields a final output d given by


d = [detector spatial response variations] [AO cell frequency response variations] [data matrix A]
    [AO cell attenuation] [point modulator response and interconnection variations] [data vector b]
    + [detector dark current] + [time-varying detector noise]    (6)


As seen, the different types of system and component variations are described by error matrices
that multiply the input data vector or input matrix data. Thus, the system errors
are described by the corresponding variations in the data matrix and vector. The detector
dark current and noise appear additively in the output vector as shown in Eq. (6).


6.  DIRECT  AND  INDIRECT  SOLUTIONS  OF  LAEs 


The solution of a system of LAEs, A x = b, is the fundamental operation required in most
linear algebra processors and signal processing applications. Thus, we concentrate on this
function. The two major types of LAE solutions are the direct or matrix decomposition solution
and the iterative or indirect solution.

The preferable iterative algorithm is [2,9]

$x(r+1) = x(r) + \omega [b - A x(r)]$,    (7)

where $\omega$ is an acceleration parameter chosen to insure convergence. The iterations (described
by the iterative index r) continue until x(r) = x(r+1). Then, (7) reduces to A x = b and
the system's output x is the desired solution. To implement (7) on the system of Figure 1,
the matrix data is fed to the AO cell one column at a time in parallel with the rows of the
matrix frequency-multiplexed, i.e. with the matrix elements $a_{mn}$ encoded in time and frequency
as a(f,t) and with the vector data x spatially-multiplexed as x(x) and fed in parallel
to the input point modulators. The matrix-vector product A x is formed and operated upon in
analog or digital post-processing electronics to produce the right-hand side of (7) and hence
the new x iterate input to the point modulators. Thus, the detector output is fed back to
the input point modulators. The length of the AO cell $N T_B$ is chosen to be just sufficient
to accommodate the matrix data. Each $T_B$, as one column of the matrix leaves the AO
cell, it is reintroduced into the bottom of the cell. This recycling of the matrix data is
more efficient for system fabrication and reduces the effects of acoustic attenuation.
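
The iteration in (7) can be written out numerically in a few lines. The sketch below is a plain software illustration under stated assumptions (a hypothetical 2 x 2 positive-definite test system and a hand-picked omega = 0.2 that satisfies 0 < omega < 2/lambda_max); it does not model the optical data flow or any error source.

```python
def matvec(A, x):
    """Matrix-vector product A x, the operation the processor performs each cycle."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def richardson(A, b, omega, iterations):
    """Fixed-count iteration x(r+1) = x(r) + omega * (b - A x(r)), starting from zero."""
    x = [0.0] * len(b)
    for _ in range(iterations):
        Ax = matvec(A, x)                                   # formed optically in the OLAP
        x = [xi + omega * (bi - axi)                        # formed in post-processing electronics
             for xi, bi, axi in zip(x, b, Ax)]
    return x

A = [[4.0, 1.0], [1.0, 3.0]]      # hypothetical test matrix (eigenvalues about 2.38 and 4.62)
b = [1.0, 2.0]
x = richardson(A, b, omega=0.2, iterations=50)   # exact solution is [1/11, 7/11]
```

Running a fixed number of iterations instead of testing x(r) = x(r+1) removes the need for a convergence check in the feedback loop, at the cost of choosing the count in advance.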

In direct solutions, the matrix A and the vector b are multiplied by a decomposition
matrix $P_1$ to generate new $A_1$ and $b_1$. Each such matrix-matrix and matrix-vector multiplication
generates one row of the final A' matrix and one element of the final b' vector.
After each matrix-matrix multiplication, the order of the matrix and vector are reduced by
one and the new reduced order $A_1$ and $b_1$ are multiplied by a new $P_2$. This procedure is repeated
N-1 times (for an N x N matrix) and yields a new upper-triangular matrix U and a new
vector b'. This simplified upper-triangular system of equations U x = b' is then easily
solved by back-substitution. The matrix decomposition can be realized either as an LU decomposition
(this is the technique we use when the matrix is positive-definite or diagonally-dominant,
as is the case here, since pivoting is then not required) or as a QR orthogonal
decomposition (this technique is more general and stable, but is more difficult to realize).
The detailed implementation of LU [2,10] and QR [2,11] decomposition and back-substitution
[2,12] have been described elsewhere. To implement the Gaussian-elimination algorithm (LU)
used in the present application on the system of Figure 1, we feed one row of the matrix A
to the AO cell in parallel (with the columns of A frequency-multiplexed, i.e. with the
elements $a_{mn}$ of A frequency and time encoded as a(t,f)) and with one row of the decomposition
matrix $P_j$ fed to the input point modulators in parallel (with the elements $p_{mn}$ of P time
and space encoded as p(t,x)). To facilitate data flow and for speed, we simultaneously
operate on A and b by using an augmented matrix. One row of the augmented matrix A' is
produced in parallel as a'(t,x) on the output detector during each of the N cycles. The new
$P_j$ matrix is easily calculated from the elements of the j-th column of the augmented matrix
in dedicated electronics.
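
A software analogue of the direct solution (elimination without pivoting on the augmented matrix, followed by back-substitution) is sketched below. It is an illustration under assumptions: the multipliers computed in the inner loop play the role of the nonzero entries of each decomposition matrix, the function names and the 2 x 2 test system are hypothetical, and nothing here reproduces the optical time, space and frequency encoding.

```python
def gaussian_eliminate(A, b):
    """Reduce the augmented matrix [A | b] to upper-triangular form without pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]        # augmented matrix
    for j in range(n - 1):                              # N-1 decomposition steps
        pivot = M[j][j]                                 # assumed nonzero (no pivoting)
        for i in range(j + 1, n):
            m = M[i][j] / pivot                         # multiplier taken from the j-th column
            M[i] = [mi - m * mj for mi, mj in zip(M[i], M[j])]
    return M

def back_substitute(M):
    """Solve the upper-triangular system held in the augmented matrix M."""
    n = len(M)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        inner = sum(M[i][k] * x[k] for k in range(i + 1, n))   # one vector inner product per row
        x[i] = (M[i][n] - inner) / M[i][i]
    return x

A = [[4.0, 1.0], [1.0, 3.0]]      # hypothetical diagonally dominant test matrix
b = [1.0, 2.0]
U_aug = gaussian_eliminate(A, b)
x = back_substitute(U_aug)        # exact solution is [1/11, 7/11]
```

Skipping pivoting is safe only for matrices such as positive-definite or diagonally dominant ones, which is the condition the text places on the use of LU decomposition.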

7. SYSTEM ERROR EFFECTS ON THE SOLUTION OF LAEs

The direct solution requires an AO cell of twice the length of the matrix, but achieves
the decomposition in a fixed number of steps. However, as noted in Section 3, iterative
algorithms can be operated with a fixed number of iterations to achieve a given desired
accuracy and iterative algorithms are essential [2] for eigensystem solutions and the
solution of nonlinear matrix equations such as the ARE [5] and in Kalman filtering [13]. In
our new results (Sections 7 and 8), we compare [6] the performance of direct and iterative
algorithms in the solution of the LAEs that arise in a specific ARE solution for the F100 engine.
The two cases considered are third and fifth-order F100 models. These give rise to 9 x 9
and 25 x 25 matrices. Bipolar data is handled by space-multiplexing [3] and this doubles
the size of the matrices and vectors required. For the third-order problem, C = 2.18, the
dynamic range is 47.7 and from (5), j = 10 iterations are required to solve each LAE. For
the fifth-order problem, C = 56.9, the matrix dynamic range is 1117 and from (5), j = 1 g, 0
iterations are required to solve each LAE. We consider three solutions: an iterative
algorithm, direct LU Gaussian-elimination with the back-substitution performed optically and
direct Gaussian-elimination with the back-substitution performed digitally with high accuracy.
We consider two problems: the solution of $A_5 x_5 = b_5$ for the fifth and last outer loop in
(2) and (3) for the solution of the ARE in (1) with $A_5$ and $b_5$ digitally calculated exactly,
and the solution of all five LAEs for all outer loop iterations.


TABLE 2

Performance of Three Algorithms for Two Data Sets in the Solution of One System of LAEs

[Table entries not recovered. Column headings: algorithm; F100 data set; response variations of point modulators (%) and detectors (%); acoustic attenuation (dB/cm); detector rms noise (%). Surviving row labels: (I) Iterative; (II) LU and Optical Back-Substitution.]

TABLE 3

Performance of Three Algorithms for Two Data Sets in the Solution of the Nonlinear ARE

[Table entries not recovered. Column headings: algorithm; F100 data set; response variations of point modulators (%) and detectors (%); acoustic attenuation (dB/cm); noise (%); error measures (%). Surviving row label: (I) Iterative.]


In Table 2, we show the results for the solution of the single fifth set of LAEs. Our
results for the full set of five LAEs, i.e. the full ARE solutions, are included in Table 3.
Data sets 3 and 5 refer to the third and fifth-order F100 matrix problems respectively. The
performance measures used in evaluating each approach are the average norm $\Delta x$ of the
error in the calculated vector x and the maximum error $\Delta\lambda_{max}$ in the location of the closed-loop
poles of the final system. The spatial, detector noise, and acoustic attenuation
errors noted earlier were selected to produce approximately equal output errors for each
error source treated separately.


In Tests 1 and 2, we see that our theoretical operational parameters (Table 1) are also
valid when noise and system errors are present. Comparing the results for Algorithms I and
II, we see that acoustic attenuation is the dominant error source for an iterative algorithm
and detector noise dominates the performance of a direct algorithm. This is expected
because of the cyclic data flow of the matrix in the AO cell during the iterative algorithm.
This alters C for the matrix. In the direct algorithm, detector noise on one cycle is fed
back to both the inputs and the AO cell and thus changes the noise distribution and its
effects accumulate. Also, detector noise affects the small vector elements and this effect
also compounds on successive cycles. From the results of Algorithms II and III, we see that
optical back-substitution yields comparable performance to digital back-substitution. This
is expected, since the operations required in back-substitution are only vector inner products
and only N-1 of these are required. This is a substantially less computationally intensive
set of operations than those required in the matrix decomposition. Thus, the
accuracy of the matrix decomposition determines the final accuracy in our results. Comparing
the results for data sets 3 and 5 and the corresponding data in Tables 2 and 3, we see that
the larger matrix size and the increased number of steps required in the ARE versus the LAE
solution causes the required accuracy to increase for direct algorithms more than for iterative
algorithms (e.g. a lower acoustic attenuation constant $\alpha$ is noted to be required for
the iterative ARE solution than for a direct LAE solution). We have derived a theoretical
expression [6]

$\alpha < 1/(2.3\, L\, C)$    (7)

for the amount of acoustic attenuation $\alpha$ in dB/cm allowed for convergence of an iterative
algorithm, where L is the length of the AO cell in cm. From the last two columns in both
tables, we see that $\Delta\lambda_{max}$ errors are significantly less than $\Delta x$ errors as expected. The
results in Tables 2 and 3 are in agreement with the theoretical guidelines in (7). From
Test 1 and all other tests, we find that spatial errors are additive and that for small errors
the percent performance scales with the magnitude of the error. In Tables 2 and 3 and in
(7), we assume that each $T_B$ of the AO cell corresponds to 1 mm and that new input data
to the point modulators in the AO cell are introduced every $T_B$. To achieve more practical
$\alpha$ levels, closer spacing of data packets in the cell is necessary. This can easily be
obtained by scaling the values given in Tables 2 and 3. Operation of the input point modulators
at a higher rate than the AO cell data [2] can also improve the $\alpha$ and detector noise
values found in Tables 2 and 3. These initial test results are intended to provide guidelines
for the efficient use of various algorithms, efficient solutions to linear and nonlinear
matrix equations, and quantitative data on performance expected. Our theory, guidelines,
and modeling are also appropriate for digital-optical linear algebra architectures.

8. REAL-TIME LABORATORY EXPERIMENTS

In Figure 2, we show the nine outputs from a laboratory system to iteratively solve the
fifth set of LAEs for the third-order F100 model (Test 1, Table 2). The outputs are shown
after 80, 400 and 640 iterations. The laboratory system used a fixed 2-D photographic mask
for the matrix in place of the AO cell and 2-D space-multiplexing in place of frequency-multiplexing.
To accommodate bipolar data, the matrix and vector were biased positive. This
increased C to 120. The laboratory system was operated at a 10 MHz data rate per channel.
To facilitate easy monitoring of the system, we used $\omega$ = -0.125. The number of iterations
j = nC required for 0.6% accuracy was calculated from (3) to be 613 iterations. Our experimental
value of 640 iterations at which convergence occurred is thus in excellent agreement
with theory. In the laboratory system, the mask errors were ±7.2% and these dominated other
spatial system errors. The detector noise was measured as 0.4%. With these errors included
in our simulator, the solution vector x was calculated and compared to the ideal theoretical x*
value and to the x vector calculated on the laboratory system. The locations of the closed-loop
poles of the system in each case were calculated and compared. The results in Table 4
show excellent agreement (0.5% accuracy or better) in the location of the poles and with the
nature of the poles preserved (e.g. complex-conjugate pole pairs).


TABLE 4

Comparison of the Closed-Loop Poles Computed Theoretically and Using
the Optical Laboratory System

THEORETICAL POLE     OPTICAL LABORATORY     % ERROR
LOCATIONS            COMPUTED POLES
-20.45 + j6.26       -20.74 + j5.88
-20.45 - j6.26       -20.74 - j5.88         1
-4.53                -4.53

FIGURE 2

The nine photo-detector outputs from a fixed mask OLAP at selected ... solution of the
system of LAEs $A_5 x_5 = b_5$ that arise in the final ... the solution of the nonlinear ARE.
Panel (a): 80 iterations.


9.  SUMMARY  AND  CONCLUSION 

We have detailed a two-loop solution to the nonlinear ARE. In the iterative solution, a
fixed number of iterations can be employed to achieve a given performance accuracy. A
direct solution of each LAE can also be employed, however the iterative solution is ...
(lCCTg vs. 9"5Tg). Selection of the operational parameters for the two-loop algorithm was
theoretically derived, verified by noise-free simulations and shown to be appropriate when
system noise and errors were present. The implementation of direct and iterative solutions
of LAEs on a frequency-multiplexed OLAP was detailed. A theoretical analysis of both algorithms
showed that acoustic attenuation was the dominant error source in iterative algorithms
and detector noise dominated direct algorithms. Our simulations verified these theoretical
predictions and quantified the performance obtained with each. Our theoretical value for
the amount of acoustic attenuation allowed to permit convergence of an iterative algorithm
was verified by simulations. We confirmed and quantified by simulations that optical back-substitution
yields comparable performance to its digital realization. Experimental verification
on a laboratory system was obtained. The guidelines and theory produced are ...
for various other systolic processors (optical and digital) and for high-accuracy
digital-optical linear algebra processors. Our nonlinear matrix solution using a fixed
number of iterations is appropriate for realization on any linear algebra processor.
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Fabrication  and  Testing  of  a  Space  and  Frequency-Multiplexed 
Optical  Linear  Algebra  Processor 

David  Casasent 

Carnegie-Mellon  University 
Department  of  Electrical  and  Computer  Engineering 
Pittsburgh,  Pennsylvania  15213 

ABSTRACT.  A  new  space/frequency-multiplexed  optical  linear  algebra  processor 
is  described.  The  electronic  support  system,  fabrication  of  the  processor 
and  initial  performance  data  are  presented. 



1.  INTRODUCTION 

Optical linear algebra processors (OLAPs) represent a most flexible and
general-purpose class of optical system. In Section 2, we describe the architecture
for a space and frequency-multiplexed OLAP. We detail (Section 2) how
this system accommodates bipolar and complex-valued data and its use in matrix-vector
processing. The electronic support system is described in Section 3.
The optical system and initial experimental results obtained on it are detailed
in Section 4.

2.  COMPLEX  AND  BIPOLAR  PROCESSOR  ARCHITECTURE 

The optical schematic for a new OLAP architecture [1] to accommodate bipolar
and/or complex-valued matrix and vector data is shown in Figure 1. For
the case shown, the matrix A has bipolar-valued elements and the vector b has
complex-valued elements. The bipolar-valued elements of one row of A are
spatially-multiplexed on two linear point modulator input arrays at P1 and the
complex-valued elements of b are encoded in the conventional three-tuple
representation [2] and frequency-multiplexed [4] to the acousto-optic (AO) cell at P2.
This architecture uses input space-multiplexing (rather than time-multiplexing
as in reference [3]) together with frequency-multiplexing [4] to accommodate
bipolar and complex-valued matrix and vector data. If both the matrix and
vector elements are complex-valued, three linear input arrays are used at P1.
If both the matrix and vector data are bipolar, Figure 1 can be used.
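The data encoding described above can be illustrated numerically. The sketch below is an assumption-laden model, not the laboratory implementation: it splits a bipolar row into two non-negative arrays (standing in for the two input arrays at P1), writes each complex vector element as three non-negative components along axes at 0, 120 and 240 degrees (standing in for the three frequency channels at P2), and recombines the channel outputs digitally. All function names, and the way the channel outputs are combined, are hypothetical.

```python
import numpy as np

# Unit vectors at 0, 120 and 240 degrees used by the three-tuple representation.
BASIS = np.exp(2j * np.pi * np.arange(3) / 3)


def split_bipolar(values):
    """Split real bipolar data into two non-negative arrays with values = pos - neg."""
    values = np.asarray(values, dtype=float)
    return np.maximum(values, 0.0), np.maximum(-values, 0.0)


def encode_three_tuple(z):
    """Return non-negative (t0, t1, t2) with z = t0*BASIS[0] + t1*BASIS[1] + t2*BASIS[2]."""
    z = complex(z)
    out = np.zeros(3)
    if z == 0:
        return out
    # Pick the 120-degree sector containing z; only its two bounding axes are used.
    k = int((np.angle(z) % (2 * np.pi)) // (2 * np.pi / 3)) % 3
    u, v = BASIS[k], BASIS[(k + 1) % 3]
    coeffs = np.linalg.solve([[u.real, v.real], [u.imag, v.imag]], [z.real, z.imag])
    out[k], out[(k + 1) % 3] = np.maximum(coeffs, 0.0)  # clip round-off below zero
    return out


def inner_product(row, b):
    """Inner product of a bipolar row with a complex vector using only non-negative data."""
    pos, neg = split_bipolar(row)                           # two input arrays
    tuples = np.array([encode_three_tuple(z) for z in b])   # shape (N, 3): three channels
    channels = pos @ tuples - neg @ tuples                  # one real value per channel
    return complex(channels @ BASIS)
```

For example, `inner_product([1.0, -2.0, 0.5], [1 + 2j, -3 + 0.5j, -1 - 1j])` should agree with the ordinary complex inner product of the same data to within round-off.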

The N point modulators per row at P1 are imaged through
separate regions of P2, with the different regions of P2 separated in time by T_B
(the propagation time of the acoustic wave between the different portions of the
AO cell at P2). Each T_B, new input data is entered at P1 and a shifted version
of the P2 vector is produced (with the vector-shift provided by the motion of
the acoustic wave with time). Thus, an N-element vector inner product is produced
each T_B and a matrix-vector product is computed each N T_B (for an N x N
matrix). This basic OLAP architecture can solve linear and nonlinear matrix
equations. The basic linear algebra operation of concern is the solution of a
system of linear algebraic equations. Various algorithms to achieve this on
such a processor have been detailed elsewhere [7].
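The timing stated above (one inner product per T_B, one matrix-vector product per N T_B) can be illustrated with a small simulation. The sketch below is not taken from reference [7]: it uses a simple Richardson iteration, x <- x + omega (b - A x), chosen here only as one example of an iterative solution of a system of linear algebraic equations. The matrix, the value of omega and the function names are all assumptions for illustration.

```python
import numpy as np


def matvec_by_rows(a, x):
    """Form A @ x one row at a time, modeling one inner product per T_B interval."""
    y = np.empty(a.shape[0])
    for i, row in enumerate(a):   # each pass stands for one T_B
        y[i] = row @ x
    return y, a.shape[0]          # result and elapsed time in units of T_B


def richardson_solve(a, b, omega, iterations):
    """Iterate x <- x + omega * (b - A x); return x and the total time in units of T_B."""
    x = np.zeros_like(b, dtype=float)
    elapsed = 0
    for _ in range(iterations):
        ax, slots = matvec_by_rows(a, x)
        x = x + omega * (b - ax)
        elapsed += slots
    return x, elapsed


# Hypothetical 3 x 3 symmetric positive-definite example.
A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
x, elapsed_tb = richardson_solve(A, b, omega=0.2, iterations=200)
```

In this model, 200 iterations on a 3 x 3 system occupy 200 x 3 = 600 T_B intervals. The time for the vector update between iterations is ignored.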

The frequency-multiplexing requirements for such a system were detailed in
Ref. [1]. For the M = 3 frequency case and the system of Figure 1, these requirements
are quite modest (Δf = 70 MHz). For the case of a banded matrix with bandwidth
B = M, the number of input point modulators per row is also quite modest.
When B exceeds the number of input point modulators, partitioning is easily
achieved as detailed elsewhere [5]. With a multi-channel AO cell at P2, and
the appropriate data encoding and time-integration of the output, the same
architecture can achieve high accuracy as detailed in Ref. [5]. Thus, this is
a most attractive, powerful and flexible OLAP architecture.
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FIGURE  1 

New analog matrix-vector space and frequency-multiplexed
architecture for complex and bipolar-valued matrix and vector elements.


3. ELECTRONIC SUPPORT SYSTEM

Any optical or digital linear algebra processor or systolic system must
provide parallel input data to P1 (N words) and to P2 (M words) each T_B, plus
provide acquisition and analysis of the parallel output P3 data each T_B. To
achieve this with flexibility and programmability and to allow input data for
any application to be processed from a digital database, a dedicated high-speed
microprocessor system was assembled. This electronic support system (Figure 2)
contains many special-purpose boards, a hard disk (lyM bytes), magnetic tape,
on-line multibus memory (512K bytes) and processor memory (512K bytes) with
600 nsec memory access, and video (Matrox interface) and graphics processor
outputs. The microprocessor used at present is an M68000, with an Intel 86/380 RMX
version also under evaluation. The general philosophy of this support
system is to download digital data from a VAX, magnetic tape, etc. into high-speed
parallel output buffer memory which drives parallel D/As to the P1 and
P2 inputs. Output data is similarly A/D converted in parallel and buffered in
an output memory. The disk system provides storage of the input and output
data. The microprocessor provides control, formatting, etc.

To provide the parallel P1 and P2 analog inputs, three special-purpose
cards with five parallel output D/As (12 bits at 10 MHz) and drivers per card
were fabricated. With 10 inputs to P1 and three inputs to P2, the system
processes 130 M bytes (12 bit bytes) of input data per second (1.5 G-bit per second data) with
T_B = 0.1 μsec. This represents a reasonable compromise between available D/A
converters and other hardware and system performance. Each D/A input is
obtained from a separate high-speed parallel buffer memory channel, each 4K words
deep (12 bit words). Three special-purpose buffer memory boards with 8 memory
channels per board have been fabricated and are used for input and output
buffering. The P3 outputs are detected (with special-purpose 20 MHz, low-noise
circuits), A/D converted (using special-purpose circuitry with one A/D per board
with 12 bit accuracy and 10 MHz bandwidth), and fed to a parallel input buffer
memory. The system's inputs settle to 0.2% in 100 nsec, thus allowing a 10 MHz data
rate (analog, bits) per channel. The necessary spatial corrections [6]
for the P1 and P2 transducers are obtained off-line and applied to the input
data. These corrections, plus input and output bipolar and complex data
normalization and encoding, are performed in software (with their on-line hardware
realization straightforward). An interface board to control the system, and an
RF modulator drive board for the AO cell complete the electronic support system.
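The input data rates quoted above follow from the channel count and word rate. The short calculation below reproduces them. The variable names are hypothetical, and two points are assumptions: that "4K words" means 4096 words, and that each channel receives one word per T_B.

```python
P1_CHANNELS = 10               # inputs to P1 (from the text)
P2_CHANNELS = 3                # inputs to P2 (from the text)
WORD_RATE_HZ = 10_000_000      # 10 MHz D/A rate per channel
BITS_PER_WORD = 12
BUFFER_DEPTH_WORDS = 4 * 1024  # "4K words"; assumes K = 1024

channels = P1_CHANNELS + P2_CHANNELS
words_per_second = channels * WORD_RATE_HZ            # 130 M words per second
bits_per_second = words_per_second * BITS_PER_WORD    # about 1.5 G-bit per second
t_b_seconds = 1.0 / WORD_RATE_HZ                       # 0.1 microsecond per word
buffer_fill_seconds = BUFFER_DEPTH_WORDS / WORD_RATE_HZ

print(f"{words_per_second / 1e6:.0f} M words/s, {bits_per_second / 1e9:.2f} Gbit/s")
print(f"T_B = {t_b_seconds * 1e6:.1f} usec, buffer holds {buffer_fill_seconds * 1e6:.1f} usec of data")
```

Under these assumptions, one full buffer channel supplies about 410 μsec of continuous data at the full rate before it must be reloaded.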


FIGURE  2 

Photograph  of  the  electronic 
support  system 


FIGURE  3 

Photograph  of  the  laboratory 
optical  matrix-vector  system 


4.  OPTICAL  SYSTEM  FABRICATION  AND  INITIAL  RESULTS 


The optical system of Figure 1 was assembled (Figure 3) using a laser
diode (LD) input array at P1 with individual collimating optics integrated
with each LD source. The P1 outputs had a 50% fill-factor and the full P1
input was reduced by a two lens system to be compatible with the size of P2
and the 0.1 μsec data packet spacings. A special input P1 mount was fabricated
to allow each P1 source to be separately aligned within 0.3 mrad to illuminate
the correct region of the AO cell at P2 with the necessary beam divergence.
For the initial laboratory system, the beam reducing optics from P1 to P2
occupied 600+20 mm and the optics from P2 to P3 required 160 mm. An even more compact
system with folded optics is easily possible.

In Figures 4 and 5, several examples of the performance of the system of
Figure 3 are provided. The laboratory system is fully automated with data loading
and output display under control of a dedicated terminal through the M68000
system. The inputs to 3 of the P1 laser diodes, the AO cell and the output vector
inner product from one detector are shown as functions of time. The results
obtained are as expected and verify the digital control and performance of the full
hybrid optical/digital system.


FIGURE 4

Three LD inputs (top 3 traces, sinewave and 2 biased ramps) and
output (bottom trace) with a constant RF AO input

FIGURE 5

Three LD inputs (top 3 traces, 2 linear ramps and a 0 input) and output
(bottom trace, quadratic as expected) with the RF AO input varying linearly in
power with time over the duration of the input ramps
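The quadratic output noted for Figure 5 can be reproduced with an idealized model in which the detector output is the sum of the LD inputs multiplied by the RF power applied to the AO cell. This model, the normalized time axis and the variable names are assumptions for illustration only. They are not a description of the measured waveforms.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 101)                   # normalized time over the input ramps
ld_inputs = np.stack([t, t, np.zeros_like(t)])   # two linear ramps and a zero input
ao_power = t                                     # RF power rising linearly with time

# Idealized detector output: summed LD inputs weighted by the AO drive.
detector_output = ld_inputs.sum(axis=0) * ao_power

# Fit a second-order polynomial; a purely quadratic output has only a t**2 term.
coeffs = np.polyfit(t, detector_output, 2)
print(coeffs)
```

With a constant RF input, as in Figure 4, the same model gives an output that is simply the sum of the three LD inputs.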
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26.  Lockheed  Missiles  &  Space  Co.  -  Sunnyvale,  CA,  "Advanced  Hybrid  Optical/Digital  Pattern 
Recognition" 

27. OSA Topical Meeting on Optical Computing - Lake Tahoe, NV, "Fabrication and Testing of a
Space and Frequency-Multiplexed Optical Linear Algebra Processor".

28.  OSA  Topical  Meeting  on  Machine  Vision  -  Lake  Tahoe,  NV,  "Hierarchical  Feature-Based 
Object  Identification". 

29.  OSA  Topical  Meeting  on  Machine  Vision  -  Lake  Tahoe,  NV,  "Correlation  Filters  for 
Distortion-Invariance  and  Discrimination". 


30. Texas Instruments - Dallas, TX, "Optical Pattern Recognition".

April  1985 

31.  Electro-Com  Automation,  Inc.  -  Dallas,  TX,  "Optical  Pattern  Recognition". 


32. Eglin Air Force Base - Ft. Walton Beach, FL, "Optical Pattern Recognition and Kalman
Filtering".


May  1985 


33. Carnegie-Mellon University - Board of Trustees, "Optical Data Processing".

August  1985 


34.  SPIE  -  San  Diego,  CA,  "Correlation  Synthetic  Discriminant  Functions  for  Object  Recognition 
and  Classification  in  High  Clutter". 


35.  SPIE  -  San  Diego,  CA,  "A  Factorized  Extended  Kalman  Filter". 

36.  SPIE  -  San  Diego,  CA,  "Optical  Finite-Element  Processor". 

September  1985 


37. SPIE - Cambridge, MA, "Parameter Estimation and In-Plane Distortion Invariant Chord
Processing".

38. SPIE - Cambridge, MA, "Optical Processing Techniques for Advanced Intelligent Robots and
Computer Vision".

39.  SPIE  -  Cambridge,  MA,  "High-Dimensionality  Feature-Space  Processing  with  Computer 
Generated  Holograms". 


18.2.1 THESES SUPPORTED BY AFOSR FUNDING (SEPTEMBER 1984 - SEPTEMBER
1985)

1. Eugene Pochapsky, M.S. Dissertation, "The Simulation of Optical Pattern Recognition
Systems", September 1984.

2. William Rozzi, M.S. Dissertation, "Advanced Quantitative Synthetic Discriminant Function
Tests on Ship Imagery", December 1984.

3. James Fisher, M.S. Dissertation, "Extended Kalman Filter Algorithms for Implementation on
a High-Accuracy Optical Processor", December 1984.

4. W.T. Chang, Ph.D. Dissertation, "Chord Distributions and Correlation SDFs in Pattern
Recognition", March 1985.
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